Effector proteins and methods of use

ABSTRACT

Provided herein are compositions, systems, and methods comprising effector proteins and uses thereof. These effector proteins are shown to be active with guide RNAs and may be characterized as CRISPR-associated (Cas) proteins. Various compositions, systems, and methods of the present disclosure leverage the activities of these effector proteins for the modification, detection, and engineering of nucleic acids.

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 63/186,700, filed May 10, 2021, U.S. Provisional Application No. 63/220,137, filed Jul. 9, 2021, U.S. Provisional Application No. 63/220,286, filed Jul. 9, 2021, U.S. Provisional Application No. 63/290,600, filed Dec. 16, 2021, and U.S. Provisional Application No. 63/316,358, filed Mar. 3, 2022, the disclosures of which are incorporated herein by reference in their entirety.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on May 10, 2022, is named 203477-709601_ST25.txt and is 820,342 bytes in size.

BACKGROUND

Programmable nucleases are proteins that bind and cleave nucleic acids in a sequence-specific manner. A programmable nuclease may bind a target region of a nucleic acid and cleave the nucleic acid within the target region or at a position adjacent to the target region. In some instances, a programmable nuclease is activated when it binds a target region of a nucleic acid to cleave regions of the nucleic acid that are near, but not adjacent to the target region. A programmable nuclease, such as a CRISPR-associated (Cas) protein, may be coupled to a guide nucleic acid that imparts activity or sequence selectivity to the programmable nuclease. In general, guide nucleic acids comprise a CRISPR RNA (crRNA) that is at least partially complementary to a target nucleic acid. In some cases, guide nucleic acids comprise a trans-activating crRNA (tracrRNA), at least a portion of which interacts with the programmable nuclease. In some cases, guide nucleic acids comprise a repeat region or a handle region, at least a portion of which interacts with the programmable nuclease, wherein a handle region comprises at least a portion of a repeat region. In some cases, a tracrRNA or intermediary RNA is provided separately from the guide nucleic acid. The tracrRNA, repeat region, handle region, or any combination thereof may hybridize to a portion of the guide nucleic acid that does not hybridize to the target nucleic acid.

Programmable nucleases may cleave nucleic acids, including single-stranded RNA (ssRNA), double-stranded DNA (dsDNA), and single-stranded DNA (ssDNA). Programmable nucleases may provide cis cleavage activity, trans cleavage activity, nickase activity, or a combination thereof. Cis cleavage activity is cleavage of a target nucleic acid that is hybridized to a guide RNA (crRNA or sgRNA), wherein cleavage occurs within or directly adjacent to the region of the target nucleic acid that is hybridized to the guide RNA. Trans cleavage activity (also referred to as trans-collateral cleavage) is cleavage of ssDNA or ssRNA that is near, but not hybridized to the guide RNA. Trans cleavage activity is triggered by the hybridization of guide RNA to the target nucleic acid. Nickase activity is the selective cleavage of one strand of a dsDNA molecule. While certain programmable nucleases may be used to edit and detect nucleic acid molecules in a sequence-specific manner, challenging biological sample conditions (e.g., high viscosity, metal chelating) may limit their accuracy and effectiveness. There is thus a need for systems and methods that employ programmable nucleases having specificity and efficiency across a wide range of sample conditions.

SUMMARY

The present disclosure provides compositions, systems, and methods comprising effector proteins and uses thereof. In general, the effector proteins are DNA modifying, are dual-guided (require a crRNA and tracrRNA, or a single guide RNA comprising portions of each, for activity), and are short (less than 700 linked amino acids in length). Thus, they are referred to herein as D2S effector proteins. Compositions, systems and methods disclosed herein leverage the nucleic acid modifying activities (e.g., cis cleavage activity and trans-collateral cleavage activity) of these D2S effector proteins for the modification, detection and engineering of target nucleic acids.

While other short, also referred to as “compact,” effectors may be known in the art, these D2S effectors are particularly compact, the majority being less than 500 amino acids in length, and several being less than 400 amino acids in length. This makes them particularly useful for delivery via viral vectors (e.g., AAV), where additional CRISPR system components (e.g., guide RNA(s), donor nucleic acid, and promoters) may be incorporated into the same viral vector, thereby enabling more efficient viral production. Small size is especially useful for self-complementary AAV (scAAV) systems, which have a very limited cargo size. In addition to their compact nature, they provide the ability to modify additional or alternative sequences relative to known effectors, due to their ability to recognize a variety of protospacer adjacent motifs (PAMs), see, e.g., Table 35. Many of the D2S effectors disclosed herein have high identity and similarity to CasM.19952, which has demonstrated “blunt” cutting, and may also provide blunt or short stagger cut ends. Blunt cutting may be advantageous over the staggered cutting that is provided by other nucleases, as there is a lower chance of spontaneous (also referred to as perfect) repair, which may decrease the chances of successful target modification and/or donor insertion.
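As a rough illustration of why protein length matters for vector packaging, the following sketch converts a protein length into a minimum coding-sequence length (three nucleotides per amino acid plus one stop codon) and compares it against a total packaging limit. The limits used here, about 4,700 nucleotides for single-stranded AAV and about half of that for scAAV, are commonly cited approximations assumed for illustration only and are not taken from this disclosure. The function and constant names are likewise hypothetical.

```python
# Approximate total packaging limits in nucleotides.
# ASSUMED illustrative values; they are not stated in this disclosure.
ASSUMED_AAV_LIMIT_NT = 4700
ASSUMED_SCAAV_LIMIT_NT = 2350


def coding_length_nt(num_amino_acids: int) -> int:
    """Minimum open reading frame length: 3 nt per residue plus one stop codon."""
    if num_amino_acids <= 0:
        raise ValueError("num_amino_acids must be positive")
    return 3 * num_amino_acids + 3


def remaining_cargo_nt(num_amino_acids: int, limit_nt: int) -> int:
    """Nucleotides left for guide RNA(s), promoters, donor nucleic acid, etc.

    A negative result means the coding sequence alone exceeds the limit.
    """
    return limit_nt - coding_length_nt(num_amino_acids)


if __name__ == "__main__":
    # Length cutoffs mentioned in the summary: 700, 500, and 400 amino acids.
    for length in (700, 500, 400):
        print(
            length,
            coding_length_nt(length),
            remaining_cargo_nt(length, ASSUMED_AAV_LIMIT_NT),
            remaining_cargo_nt(length, ASSUMED_SCAAV_LIMIT_NT),
        )
```

Under these assumed limits, a shorter effector leaves more room in the same vector for the other components named above. The calculation ignores linkers, tags, and regulatory elements, so it is a lower bound on the space actually consumed.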

I. Certain Embodiments

Provided herein are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 1.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 2.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 3.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 4.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 5.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 6.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 7.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 8.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 9.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 10.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 11.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 12.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 13.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 14.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 15.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 16.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 17.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 18.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 19.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 20.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 21.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 22.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 24.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 25.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 26.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 27.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 28.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 29.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 30.

Also provided herein, is a composition comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 31.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 32.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 33.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 34.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 35.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 36.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 37.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 38.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 39.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 40.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 41.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 42.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 43.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 44.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 45.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 202.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 203.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 204.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 205.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 206.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 207.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 208.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 209.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 210.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 211.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 212.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 213.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 214.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 215.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 216.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 217.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 218.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 219.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 220.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 221.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 222.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 223.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 224.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 225.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 226.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 227.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 228.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 229.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 230.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 231.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 232.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 233.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 234.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 235.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 236.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 237.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 238.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 239.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 240.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 728.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 729.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 730.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 731.
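
By way of non-limiting illustration, an identity threshold of the kind recited above may be checked computationally. The following Python sketch assumes that the two sequences have already been aligned (with gaps written as "-") and that percent identity is counted as the number of matching columns divided by the number of aligned columns. That convention, the function names percent_identity and thresholds_met, and the toy sequences are assumptions made for illustration only; they are not taken from this disclosure, which may define sequence identity differently.

```python
# Illustrative sketch only: the identity convention used here (matching columns
# over aligned columns) is an assumption, not a definition from the disclosure.

# Identity thresholds recited in the paragraphs above, in percent.
THRESHOLDS = (75, 80, 85, 90, 95, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Columns in which both sequences carry a gap are ignored. A column
    counts as a match only when both residues are equal and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Hypothetical 10-residue toy sequences, not sequences from the Sequence Listing.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYIAKQA"
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

In practice the alignment step itself would typically be performed with an established alignment tool before a count of this kind is made; the sketch leaves that step out.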

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises one or more amino acid alterations.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises one or more amino acid alterations in one or more domains comprising a REC domain, RuvC-I domain, or a RuvC-II domain.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises: one or more amino acid alterations at a position corresponding to 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, or 132 in a REC domain; one or more amino acid alterations at a position corresponding to 261, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281 or 282 in a RuvC-I domain; one or more amino acid alterations at a position corresponding to 457, 458, 459, 460, 461, 462, 463, 464, 466, 467 or 468 in a RuvC-II domain; or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises one or more amino acid alterations: T115R, T124R, L126R, E127R, T128R, N129R, or A132R in a REC domain; K261R, V263R, T278R, T281R, or E282R in a RuvC-I domain; N459R, S460R, D462R, K466R, N467R, or E468R in a RuvC-II domain; or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises one or more amino acid alterations at a position corresponding to one or more residues A110, T111, E112, M113, S114, T115, Q116, S117, L118, S119, F122, A123, T124, E125, L126, E127, T128, N129, I130, F131, A132, K261, V263, V264, G265, V266, D267, L268, G269, I270, N271, V272, P273, A274, Y275, V276, A277, T278, N279, I280, T281, E282, E363, I457, A458, N459, S460, K461, D462, I463, I464, K466, N467, E468, or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein amino acid sequence comprises one or more amino acid alterations comprising one or more of A110R, T111R, E112R, M113R, S114R, T115R, Q116R, S117R, L118R, S119R, F122R, A123R, T124R, E125R, L126R, E127R, T128R, N129R, I130R, F131R, A132R, K261R, V263R, V264R, G265R, V266R, D267R, D267A, D267N, L268R, G269R, I270R, N271R, V272R, P273R, A274R, Y275R, V276R, A277R, T278R, N279R, I280R, T281R, E282R, E363Q, I457R, A458R, N459R, S460R, K461R, D462R, I463R, I464R, K466R, N467R, E468R or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein amino acid sequence comprises one or more amino acid alterations comprising one or more of D267A, E363Q, or both.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein amino acid sequence comprises one or more amino acid alterations comprising one or more of D267N, E363Q, or both.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein amino acid sequence comprises one or more amino acid alterations comprising one or more of T115R, T124R, L126R, E127R, T128R, N129R, A132R, K261R, V263R, T278R, T281R, E282R, N459R, S460R, D462R, K466R, N467R, E468R or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein amino acid sequence comprises one or more amino acid alterations comprising one or more of T124R, T128R, N129R, T278R, E282R, T281R, or any combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises a T124R, T128R or N129R amino acid alteration.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises a T278R, E282R, or T281R amino acid alteration.
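
By way of non-limiting illustration, the alteration labels used above (for example, T124R or D267A) may be read as an original residue, a 1-based position, and a replacement residue. The following Python sketch parses such labels and applies them to a sequence, rejecting a label whose original residue does not match the sequence at that position. The function names parse_substitution and apply_substitutions and the five-residue toy sequence are hypothetical and are used for illustration only; the sketch assumes that positions are numbered directly along the supplied sequence, whereas positions "corresponding to" a reference sequence would first require an alignment.

```python
# Illustrative sketch only: function names and the toy sequence are hypothetical.
import re

# One-letter original residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str):
    """Split a label such as 'T124R' into ('T', 124, 'R')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, labels) -> str:
    """Return a copy of the sequence with each labelled substitution applied.

    Positions are 1-based and index directly into the supplied sequence
    (an assumption of this sketch).
    """
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"{label}: expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical toy sequence, not a sequence from the Sequence Listing.
    toy = "MATDE"
    print(apply_substitutions(toy, ["T3R", "D4A"]))
```

Checking the original residue before substituting guards against applying a label to a sequence that is numbered differently from the one the label was written for.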

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A110R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 241.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T111R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 242.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E112R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 243.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an M113R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 244.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an S114R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 245.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T115R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 246.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a Q116R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 247.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an S117R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 248.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an L118R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 249.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an S119R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 250.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an F122R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 251.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A123R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 252.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T124R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 253.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E125R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 254.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an L126R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 255.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E127R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 256.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T128R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 257.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an N129R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 258.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I130R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 259.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an F131R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 260.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A132R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 261.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a K261R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 262.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a V263R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 263.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a V264R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 264.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a G265R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 265.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a V266R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 266.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a D267R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 267.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an L268R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 268.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a G269R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 269.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I270R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 270.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an N271R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 271.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a V272R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 272.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a P273R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 273.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A274R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 274.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a Y275R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 275.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a V276R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 276.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A277R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 277.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T278R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 278.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an N279R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 279.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I280R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 280.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a T281R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 281.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E282R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 282.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I457R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 283.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an A458R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 284.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an N459R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 285.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an S460R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 286.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a K461R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 287.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a D462R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 288.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I463R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 289.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an I464R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 290.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a K466R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 291.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an N467R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 292.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E468R amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 293.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a D267A amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 728.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a D267A amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 729.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than a D267N amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 730.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than an E363Q amino acid alteration, is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 731.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs: 1-13, and wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-58, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-103, or (iii) a combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs: 14-21, and wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 59-66, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 104-119, or (iii) a combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs: 22-34, and wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 67-79, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 120-127, or (iii) a combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs: 35-45, and wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 80-90, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 128-148, or (iii) a combination thereof.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 1, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 46 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 91.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 2, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 47 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 92.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 3, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 48 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 93.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 4, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 49 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 94.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 5, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 50 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 95.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 6, wherein the engineered guide nucleicacid comprises: (i) a crRNA comprising a nucleobase sequence that is atleast 75%, at least 80%, at least 85%, at least 90%, at least 95%, or100% identical to SEQ ID NO: 51 and (ii) a tracrRNA comprising anucleobase sequence that is at least 75%, at least 80%, at least 85%, atleast 90%, at least 95%, or 100% identical to SEQ ID NO: 96.

Also provided herein, are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 7, wherein the engineered guide nucleicacid comprises: (i) a crRNA comprising a nucleobase sequence that is atleast 75%, at least 80%, at least 85%, at least 90%, at least 95%, or100% identical to SEQ ID NO: 52 and (ii) a tracrRNA comprising anucleobase sequence that is at least 75%, at least 80%, at least 85%, atleast 90%, at least 95%, or 100% identical to SEQ ID NO: 97.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 8, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 53 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 98.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 9, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 54 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 99.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 10, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 55 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 100.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 11, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 56 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 101.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 12, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 57 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 102.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 13, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 58 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 103.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 14, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 59 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 104.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 14, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 59 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 105.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 15, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 106.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 15, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 107.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 16, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 61 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 108.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 16, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 61 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 109.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 17, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 62 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 110.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 17, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 62 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 111.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 18, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 63 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 112.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 18, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 63 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 113.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 19, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 64 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 114.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 19, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 64 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 115.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 20, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 65 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 116.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 20, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 65 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 117.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 21, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 66 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 118.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 21, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 66 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 119.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 22, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 67 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 68 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 24, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 69 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 70 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 122.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 26, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 71 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 27, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 72 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 123.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 73 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 22, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 74 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 30, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 75 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 124.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 76 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 122.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 32, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 77 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 124.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 33, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 78 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 125.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 33, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 78 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 126.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 79 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 127.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 35, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 80 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 128.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 35, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 80 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 129.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 36, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 81 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 130.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 36, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 81 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 131.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 37, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 82 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 132.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 37, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 82 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 133.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 38, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 83 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 134.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 38, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 83 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 135.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 39, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 84 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 136.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 39, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 84 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 137.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 40, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 85 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 138.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 41, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 86 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 139.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 41, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 86 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 140.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 42, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 87 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 141.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 42, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 87 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 142.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 43, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 88 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 143.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 43, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 88 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 144.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 44, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 89 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 145.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 44, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 89 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 146.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 45, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 147.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 45, wherein the engineered guide nucleic acid comprises: (i) a crRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90 and (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 148.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 22, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 24, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 150.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 25, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 151.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 26, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 28, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 150.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 29, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 150.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 30, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 152.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 31, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 151.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 32, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 152.

Also provided herein, are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 33, wherein the engineered guide nucleic acid comprises: a single guide RNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 153.
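The percent-identity tiers recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and that identity is counted as matching columns divided by all aligned columns. That convention, the function names, and the example strings are assumptions made for illustration only; the disclosure does not specify an alignment tool or counting convention in this passage.

```python
# Illustrative sketch only: the identity convention and all names below are assumptions.

# Identity tiers recited in the text for effector proteins and sgRNAs.
IDENTITY_TIERS = (75, 80, 85, 90, 95, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Both inputs must have the same length; '-' marks a gap.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_tier_met(identity: float, tiers=IDENTITY_TIERS):
    """Return the highest recited tier that the identity value satisfies, or None."""
    met = [t for t in tiers if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical 10-nucleotide sequences, not taken from the Sequence Listing.
    identity = percent_identity("ACGUACGUAC", "ACGUACGUAA")
    print(identity, highest_tier_met(identity))
```

Other identity conventions (for example, dividing by the length of the reference sequence) would give different values for gapped alignments, so the convention should be fixed before comparing against a threshold.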

Also provided herein is any one of the compositions disclosed herein, wherein the crRNA and the tracrRNA are linked in a single guide RNA.

Also provided herein is any one of the compositions disclosed herein, wherein the effector protein comprises a nuclear localization signal.

Also provided herein is a pharmaceutical composition comprising any one of the compositions disclosed herein and a pharmaceutically acceptable excipient.

Also provided herein, are systems comprising any one of the compositions disclosed herein. In some embodiments, the system comprises at least one detection reagent for detecting a target nucleic acid. In some embodiments, the at least one detection reagent is selected from a reporter nucleic acid, a detection moiety, an additional effector protein, or a combination thereof, optionally wherein the reporter nucleic acid comprises a fluorophore, a quencher, or a combination thereof. In some embodiments, the system further comprises at least one amplification reagent for amplifying a target nucleic acid. In some embodiments, the at least one amplification reagent is selected from the group consisting of a primer, a polymerase, an activator, a dNTP, an rNTP, and combinations thereof. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of SEQ ID NOs: 256-270. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of SEQ ID NOs: 301-371.

Also provided herein are methods of detecting a target nucleic acid in a sample, comprising contacting the sample with any one of the compositions disclosed herein or any one of the systems disclosed herein, thereby generating a modification of the target nucleic acid; and detecting the modification. In some embodiments, the methods can comprise the steps of: (a) contacting the sample with: (i) any one of the compositions disclosed herein or any one of the systems disclosed herein; and (ii) a reporter nucleic acid comprising a detectable moiety that produces a detectable signal in the presence of the target nucleic acid and the composition or system; and (b) detecting the detectable signal. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of SEQ ID NOs: 256-270. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of SEQ ID NOs: 301-371.

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 1; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 46; and (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 91. In some instances, the target nucleic acid has a PAM sequence of CTT (SEQ ID NO: 154).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 13; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 58; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 103; and (iv) the target nucleic acid has a PAM sequence of CTT (SEQ ID NO: 154).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 15; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 106; and (iv) the target nucleic acid has a PAM sequence of CC (SEQ ID NO: 155). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 15; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 107; and (iv) the target nucleic acid has a PAM sequence of CC (SEQ ID NO: 155).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 22; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 67; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120; and (iv) the target nucleic acid has a PAM sequence of GCG (SEQ ID NO: 157). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 22; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149; and (iii) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 68; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120; and (iv) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156), TTG (SEQ ID NO: 158), GCG (SEQ ID NO: 157), or GTG (SEQ ID NO: 159). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149; and (iii) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156), TTG (SEQ ID NO: 158), GCG (SEQ ID NO: 157), or GTG (SEQ ID NO: 159).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 24; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 69; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121; and (iv) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156). In some examples, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 24; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 150; and (iii) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 25; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 70; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 122; and (iv) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 25; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 151; and (iii) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 26; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 149; and (iii) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 28; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 73; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121; and (iv) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 28; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 150; and (iii) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 31; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 151; and (iii) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 32; (ii) the sgRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 152; and (iii) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156) or GCG (SEQ ID NO: 157).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 21; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 66; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 118; and (iv) the target nucleic acid has a PAM sequence of TC (SEQ ID NO: 164).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 29; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 74; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121; and (iv) the target nucleic acid has a PAM sequence of ATTG (SEQ ID NO: 161), ACTG (SEQ ID NO: 165), GTTG (SEQ ID NO: 163), or GCTG (SEQ ID NO: 166).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 30; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 75; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 124; and (iv) the target nucleic acid has a PAM sequence of TCG (SEQ ID NO: 156).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 34; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 79; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 127; and (iv) the target nucleic acid has a PAM sequence of ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), or GTTG (SEQ ID NO: 163).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 44; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 89; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 145; and (iv) the target nucleic acid has a PAM sequence of TTC (SEQ ID NO: 167).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 45; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 147; and (iv) the target nucleic acid has a PAM sequence of TTT (SEQ ID NO: 168) or TTC (SEQ ID NO: 167). In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 45; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 148; and (iv) the target nucleic acid has a PAM sequence of TTT (SEQ ID NO: 168) or TTC (SEQ ID NO: 167).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 18; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 63; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 113; and (iv) the target nucleic acid has a PAM sequence of CC (SEQ ID NO: 155).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 19; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 64; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 114; and (iv) the target nucleic acid has a PAM sequence of CC (SEQ ID NO: 155).

In some embodiments, (i) the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 43; (ii) the crRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 88; (iii) the tracrRNA comprises a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 144; and (iv) the target nucleic acid has a PAM sequence of TTC (SEQ ID NO: 167).
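The pairing of effector proteins with PAM sequences in the embodiments above can be illustrated by a small lookup-and-scan sketch. The table below copies three of the effector-to-PAM pairings stated in the text; the dictionary layout, the function name, and the example target string are hypothetical. The sketch only reports where a listed PAM occurs on the given strand. It makes no assumption about whether the PAM lies 5' or 3' of the protospacer, which this passage does not specify.

```python
# Illustrative sketch only: names and the example target are hypothetical.

# PAM sequences recited in the text, keyed by effector protein SEQ ID NO.
PAMS_BY_EFFECTOR_SEQ_ID = {
    23: ("TCG", "TTG", "GCG", "GTG"),
    25: ("ATTA", "ATTG", "GTTA", "GTTG"),
    45: ("TTT", "TTC"),
}


def find_pam_sites(target: str, pams) -> list:
    """Return (position, pam) pairs for every occurrence of a PAM in target.

    Positions are 0-based offsets on the strand as given; overlapping
    occurrences are reported. Results are sorted by position, then PAM.
    """
    target = target.upper()
    sites = []
    for pam in pams:
        start = target.find(pam)
        while start != -1:
            sites.append((start, pam))
            # Advance by one so overlapping matches are not skipped.
            start = target.find(pam, start + 1)
    return sorted(sites)


if __name__ == "__main__":
    # Hypothetical 10-nucleotide target, not taken from the Sequence Listing.
    example_target = "AATCGTTGCC"
    print(find_pam_sites(example_target, PAMS_BY_EFFECTOR_SEQ_ID[23]))
```

A fuller check would also scan the reverse complement and confirm that the guide's spacer matches the sequence adjacent to the PAM; both are omitted here to keep the sketch short.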

In some embodiments, the reporter nucleic acid comprises a fluorophore, a quencher, or a combination thereof, and the detecting comprises detecting a fluorescent signal. In some embodiments, the method further comprises reverse transcribing the target nucleic acid, amplifying the target nucleic acid, in vitro transcribing the target nucleic acid, or any combination thereof. In some embodiments, the method further comprises reverse transcribing the target nucleic acid and/or amplifying the target nucleic acid before contacting the sample with the composition. In some embodiments, the method further comprises reverse transcribing the target nucleic acid and/or amplifying the target nucleic acid after contacting the sample with the composition. In some embodiments, the amplifying comprises isothermal amplification. In some examples, the target nucleic acid is from a pathogen. In some examples, the pathogen is a virus. In some embodiments, the virus is a SARS-CoV-2 virus or a variant thereof, an influenza A virus, an influenza B virus, a human papillomavirus, a herpes simplex virus, or a combination thereof. In some embodiments, the pathogen is a bacterium. In some embodiments, the bacterium is Chlamydia trachomatis. In some embodiments, the target nucleic acid is an RNA. In some embodiments, the target nucleic acid is DNA.

Also provided herein is a method of modifying a target nucleic acid, themethod comprising contacting the target nucleic acid with any one of thecompositions provided herein, thereby modifying the target nucleic acid.In some embodiments, modifying the target nucleic acid comprisescleaving the target nucleic acid, deleting a nucleotide of the targetnucleic acid, inserting a nucleotide into the target nucleic acid,substituting a nucleotide of the target nucleic acid with a donornucleotide or an additional nucleotide, or any combination thereof. Insome embodiments, the method further comprises contacting the targetnucleic acid with a donor nucleic acid. In some embodiments, the targetnucleic acid comprises a mutation associated with a disease. In someembodiments, the disease is suspected to cause, at least in part, acancer, an inherited disorder, an ophthalmological disorder, or acombination thereof. In some embodiments, the disease is cancer, anophthalmological disease, a neurological disorder, a blood disorder, ora metabolic disorder. In some embodiments, the neurological disorder isDuchenne muscular dystrophy, myotonic dystrophy Type 1, or cysticfibrosis. In some embodiments, the neurological disorder is aneurodegenerative disease. In some embodiments, the target nucleic acidis encoded by a gene selected from TABLE 4. In some embodiments, thegene is PCSK9. In some embodiments, the gene is B2M, TRAC, or CIITA, orNGCG_B2M, or a combination thereof. In some embodiments, the gene isIRAC, B2M, PD1, or a combination thereof. In some embodiments, thecontacting occurs in vitro. In some embodiments, the contacting occursin vivo. In some embodiments, the contacting occurs ex vivo.

Also provided herein is a method of modifying a target nucleic acid, the method comprising contacting any one of the systems disclosed herein with the target nucleic acid, thereby modifying the target nucleic acid. In some embodiments, modifying the target nucleic acid comprises cleaving the target nucleic acid, deleting a nucleotide of the target nucleic acid, inserting a nucleotide into the target nucleic acid, substituting a nucleotide of the target nucleic acid with a donor nucleotide or an additional nucleotide, or any combination thereof. In some embodiments, the method further comprises contacting the target nucleic acid with a donor nucleic acid. In some embodiments, the target nucleic acid comprises a mutation associated with a disease. In some embodiments, the disease is suspected to cause, at least in part, a cancer, an inherited disorder, an ophthalmological disorder, or a combination thereof. In some embodiments, the disease is cancer, an ophthalmological disease, a neurological disorder, a blood disorder, or a metabolic disorder. In some embodiments, the neurological disorder is Duchenne muscular dystrophy, myotonic dystrophy Type 1, or cystic fibrosis. In some embodiments, the neurological disorder is a neurodegenerative disease. In some embodiments, the target nucleic acid is encoded by a gene selected from TABLE 4. In some embodiments, the gene is PCSK9. In some embodiments, the gene is IRAC, B2M, PD1, or a combination thereof. In some embodiments, the contacting occurs in vitro. In some embodiments, the contacting occurs in vivo. In some embodiments, the contacting occurs ex vivo.

Also provided herein is a cell comprising any one of the compositions provided herein. In some embodiments, the cell is a T cell. In some examples, the T cell is a natural killer T cell (NKT). In some embodiments, the cell is an induced pluripotent stem cell (iPSC).

Also provided herein is a cell produced by any one of the methods disclosed herein. In some embodiments, the cell is a T cell. In some examples, the T cell is a natural killer T cell (NKT). In some embodiments, the cell is an induced pluripotent stem cell (iPSC).

Also provided herein is a population of cells produced by any one of the methods disclosed herein. In some examples, the population of cells comprises T cells. In some examples, the population of cells comprises NKT cells. In some examples, the population of cells comprises iPSCs.

Also provided herein is a method of producing a protein, the method comprising: (i) contacting a cell comprising a target nucleic acid with the composition of any one of claims 1-126, thereby editing the target nucleic acid to produce a modified cell comprising a modified nucleic acid; and (ii) producing a protein from the cell that is encoded, transcriptionally affected, or translationally affected by the modified nucleic acid. In some embodiments, the method further comprises contacting the cell with a DNA donor template. In some embodiments, the cell is a cancer cell, an animal cell, an HEK293 cell, or an immune cell. In some embodiments, the cell is a Chinese hamster ovary cell. In some embodiments, the method further comprises treating a disease.

Also provided herein are methods of editing a target nucleic acid in amammalian cell comprising contacting the mammalian cell with acomposition comprising an effector protein and a guide nucleic acid,wherein the effector protein comprises an amino acid sequence that is atleast 75%, at least 80%, at least 85%, at least 90%, at least 95%, atleast 98%, at least 99% or 100% identical to SEQ ID NO: 23. In someembodiments, the guide nucleic acid comprises a sequence that is atleast 75%, at least 80%, at least 85%, at least 90%, at least 95%, atleast 98%, at least 99% or 100% identical to SEQ ID NO: 186. In someembodiments, the guide nucleic acid comprises at least about 40, atleast about 50, at least about 60, or at least about 70 contiguousnucleotides that are at least 75%, at least 80%, at least 85%, at least90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQID NO: 186.
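By way of illustration only, the percent identity thresholds recited above may be evaluated computationally. The following is a minimal sketch that assumes the two sequences have already been aligned to equal length and that identity is counted position by position without gaps; this passage does not prescribe an alignment method, so both assumptions, as well as the function names `percent_identity` and `meets_threshold`, are hypothetical.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumption: both sequences are already aligned to the same length and
    contain no gap characters. Real comparisons would normally use a
    pairwise alignment step first, which is not shown here.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    # Compare case-insensitively, one position at a time.
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(query: str, reference: str, threshold: float = 75.0) -> bool:
    """True if the query is at least `threshold` percent identical."""
    return percent_identity(query, reference) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration; they are not sequences of this disclosure.
    print(percent_identity("ACGU", "ACGA"))
    print(meets_threshold("ACGU", "ACGA", 80.0))
```

In this sketch, a query differing from a four-residue reference at one position is 75% identical, so it satisfies an "at least 75%" limitation but not an "at least 80%" limitation.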

Also provided herein are mammalian cells or a population of mammalian cells produced by any of the methods described herein.

Also described herein are methods of editing a target nucleic acid in amammalian cell comprising contacting the mammalian cell with acomposition comprising a fusion protein, wherein the fusion proteincomprises a fusion partner protein and an effector protein, wherein theeffector protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 729. In some embodiments, thefusion partner protein comprises a base editing enzyme. In someembodiments, the base editing enzyme comprises a deaminase or an enzymewith deaminase activity. In some embodiments, the fusion partner proteinis selected from the group consisting of: ABE8e, ABE8.20m, APOBEC3, andAncBE4Max. In some embodiments, the fusion partner protein comprises anamino acid sequence that is at least 75%, at least 80%, at least 85%, atleast 90%, at least 95%, at least 98%, at least 99% or 100% identical toany one of SEQ ID NOs: 713, 714, 732 and 733. In some embodiments, themethod further comprises contacting the mammalian cells with a guideRNA, wherein the guide RNA comprises a sequence that is at least 75%, atleast 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to any one of SEQ ID NOs: 715-727. In someembodiments, the target nucleic acid comprises B2M, TRAC, CIITA,NGCG_B2M, or any combination thereof.

Also disclosed herein are methods of modifying the expression of atarget nucleic acid comprising contacting the mammalian cell with acomposition comprising a fusion protein, wherein the fusion proteincomprises a fusion partner protein and an effector protein, wherein theeffector protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 728. In some embodiments, thefusion partner protein comprises a transcriptional activator. In someembodiments, the fusion partner protein comprises an amino acid sequencethat is at least 75%, at least 80%, at least 85%, at least 90%, at least95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 300. Insome embodiments, the method further comprises contacting the mammaliancells with a guide RNA, wherein the guide RNA comprises a sequence thatis at least 75%, at least 80%, at least 85%, at least 90%, at least 95%,at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs:647-710. In some embodiments, the target nucleic acid comprises NEUROD1,HBG1, ASCL1, LIN28A, or any combination thereof.

Also disclosed herein are methods of modifying the expression of atarget nucleic acid comprising contacting the mammalian cell with acomposition comprising a fusion protein, wherein the fusion proteincomprises a fusion partner protein and an effector protein, wherein theeffector protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 729. In some embodiments, thefusion partner protein comprises a transcriptional activator. In someembodiments, the fusion partner protein comprises an amino acid sequencethat is at least 75%, at least 80%, at least 85%, at least 90%, at least95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 300. Insome embodiments, the method further comprises contacting the mammaliancells with a guide RNA, wherein the guide RNA comprises a sequence thatis at least 75%, at least 80%, at least 85%, at least 90%, at least 95%,at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs:647-710. In some embodiments, the target nucleic acid comprises NEUROD1,HBG1, ASCL1, LIN28A, or any combination thereof.

Also disclosed herein are methods of modifying the expression of atarget nucleic acid comprising contacting the mammalian cell with acomposition comprising a fusion protein, wherein the fusion proteincomprises a fusion partner protein and an effector protein, wherein theeffector protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 730. In some embodiments, thefusion partner protein comprises a transcriptional activator. In someembodiments, the fusion partner protein comprises an amino acid sequencethat is at least 75%, at least 80%, at least 85%, at least 90%, at least95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 300. Insome embodiments, the method further comprises contacting the mammaliancells with a guide RNA, wherein the guide RNA comprises a sequence thatis at least 75%, at least 80%, at least 85%, at least 90%, at least 95%,at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs:647-710. In some embodiments, the target nucleic acid comprises NEUROD1,HBG1, ASCL1, LIN28A, or any combination thereof.

Also disclosed herein are methods of modifying the expression of atarget nucleic acid comprising contacting the mammalian cell with acomposition comprising a fusion protein, wherein the fusion proteincomprises a fusion partner protein and an effector protein, wherein theeffector protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 731. In some embodiments, thefusion partner protein comprises a transcriptional activator. In someembodiments, the fusion partner protein comprises an amino acid sequencethat is at least 75%, at least 80%, at least 85%, at least 90%, at least95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 300. Insome embodiments, the method further comprises contacting the mammaliancells with a guide RNA, wherein the guide RNA comprises a sequence thatis at least 75%, at least 80%, at least 85%, at least 90%, at least 95%,at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs:647-710. In some embodiments, the target nucleic acid comprises NEUROD1,HBG1, ASCL1, LIN28A, or any combination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: (i) a crRNA orsgRNA comprising a nucleobase sequence that is at least 75%, at least80%, at least 85%, at least 90%, at least 95%, or 100% identical to anyone of the crRNA or sgRNA sequences of TABLE 13, (ii) a tracrRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofthe tracrRNA sequences of TABLE 13, or (iii) a combination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: (i) a crRNA orsgRNA comprising a nucleobase sequence that is at least 75%, at least80%, at least 85%, at least 90%, at least 95%, or 100% identical to anyone of the crRNA or sgRNA sequences of TABLE 14, (ii) a tracrRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofthe tracrRNA sequences of TABLE 14, or (iii) a combination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 223, 224, or 214 and whereinthe engineered guide nucleic acid comprises: (i) a crRNA or sgRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofSEQ ID NOs: 463, 464, or 466, (ii) a tracrRNA comprising a nucleobasesequence that is at least 75%, at least 80%, at least 85%, at least 90%,at least 95%, or 100% identical to SEQ ID NO: 465, or (iii) acombination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: a crRNA or sgRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO:180 or 467.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: a crRNA or sgRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofthe crRNA or sgRNA sequences of SEQ ID NOs: 468-481.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 23, and wherein the engineered guidenucleic acid comprises: a crRNA or sgRNA comprising a nucleobasesequence that is at least 75%, at least 80%, at least 85%, at least 90%,at least 95%, or 100% identical to any one of the crRNA or sgRNAsequences of TABLE 18.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: a crRNA or sgRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofthe crRNA or sgRNA sequences of TABLE 19.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, andwherein the engineered guide nucleic acid comprises: (i) a crRNA orsgRNA comprising a nucleobase sequence that is at least 75%, at least80%, at least 85%, at least 90%, at least 95%, or 100% identical to anyone of the crRNA or sgRNA sequences of TABLE 20, (ii) a tracrRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofthe tracrRNA sequences of TABLE 20, or (iii) a combination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to any one of SEQ ID NOs: 232, 233, 240, or 227, andwherein the engineered guide nucleic acid comprises: a crRNA or sgRNAcomprising a nucleobase sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, or 100% identical to any one ofSEQ ID NOs: 612-615.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 228, and wherein the engineered guidenucleic acid comprises: a sgRNA comprising a nucleobase sequence that isat least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or100% identical to SEQ ID NO: 616.

Also disclosed herein are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 215, and wherein the engineered guide nucleic acid comprises: (i) a crRNA or sgRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 617, 620 or 621, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 618-619, or (iii) a combination thereof.

Also disclosed herein are compositions comprising an effector proteinand an engineered guide nucleic acid, wherein the effector proteincomprises an amino acid sequence that is at least 75%, at least 80%, atleast 85%, at least 90%, at least 95%, at least 98%, at least 99% or100% identical to SEQ ID NO: 23, and wherein the engineered guidenucleic acid comprises: (i) a crRNA or sgRNA comprising a nucleobasesequence that is at least 75%, at least 80%, at least 85%, at least 90%,at least 95%, or 100% identical to any one of SEQ ID NOs: 68 and 149,(ii) a tracrRNA comprising a nucleobase sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, or 100%identical to SEQ ID NO: 120, or (iii) a combination thereof.

Also disclosed herein are compositions comprising an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23, and wherein the engineered guide nucleic acid comprises: (i) a crRNA or sgRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of the sgRNA sequences of TABLE 25, (ii) a tracrRNA comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of the tracrRNA sequences of TABLE 25, (iii) a linker sequence comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to the linker sequence of SEQ ID NO: 623, (iv) a spacer sequence comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of the spacer sequences of TABLE 25, (v) a repeat sequence comprising a nucleobase sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of the repeat sequences of TABLE 25, or (vi) a combination thereof.

Also disclosed herein are methods of modifying a target nucleic acid in a sample, comprising contacting the sample with a composition disclosed herein, thereby generating a modification of the target nucleic acid; and optionally detecting the modification. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of the PAM sequences in TABLE 13, TABLE 14, TABLE 16, TABLE 17, TABLE 20, TABLE 21, TABLE 22, TABLE 23, or TABLE 24. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of the PAM sequences in TABLE 13. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of the PAM sequences in TABLE 14. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 368-371. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 369 and 370. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 304, 312, 313, 315, 324, and 335. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 301, 318, 335, 343, 360, and 365. In some embodiments, the target nucleic acid comprises a PAM sequence of SEQ ID NO: 368. In some embodiments, the target nucleic acid comprises a PAM sequence of SEQ ID NO: 343. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 325-328.

Also disclosed herein are systems for detecting or modifying a target sequence of a target nucleic acid comprising: a) a polypeptide, or a nucleic acid encoding the polypeptide; and b) an engineered guide nucleic acid, wherein the polypeptide comprises an amino acid sequence that is at least 85% identical to SEQ ID NO: 23. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 23. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 95% identical to SEQ ID NO: 23. In some embodiments, the polypeptide comprises the sequence of SEQ ID NO: 23. In some embodiments, the engineered guide nucleic acid comprises a sequence that is at least 80% identical to a sequence selected from: SEQ ID NOs: 624, 628, 630, 634, 638, 641, 643, and 645. In some embodiments, the engineered guide nucleic acid comprises a sequence that is at least 95% identical to a sequence selected from: SEQ ID NOs: 624, 628, 630, 634, 638, 641, 643, and 645. In some embodiments, the polypeptide comprises a mutation that reduces an enzymatic activity of the polypeptide relative to a polypeptide that is 100% identical to SEQ ID NO: 23. In some embodiments, the polypeptide is capable of binding to the target nucleic acid but has reduced or no nuclease activity on the target nucleic acid. In some embodiments, the system comprises a fusion partner protein fused to the polypeptide. In some embodiments, the polypeptide is a nuclease that is capable of cleaving at least one strand of a target nucleic acid. In some embodiments, the system comprises at least one of a detection reagent and an amplification reagent. In some embodiments, the detection reagent is selected from: a reporter nucleic acid, a detection moiety, an additional polypeptide, and a combination thereof. In some embodiments, the amplification reagent is selected from: a primer, a polymerase, a dNTP, an rNTP, and a combination thereof. In some embodiments, the target nucleic acid comprises a protospacer adjacent motif (PAM) selected from any one of SEQ ID NOs: 156-159, 325-328, and 369, wherein the PAM is required for the polypeptide and engineered guide nucleic acid to detect or modify the target sequence. In some embodiments, the target nucleic acid comprises a PAM sequence of SEQ ID NO: 369. In some embodiments, the nucleic acid encoding the polypeptide is an expression vector. In some embodiments, the expression vector comprises or encodes the engineered guide nucleic acid. In some embodiments, the expression vector is an adeno-associated viral vector. In some embodiments, the nucleic acid encoding the polypeptide is a messenger RNA. In some embodiments, the system comprises a lipid or lipid nanoparticle.
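By way of illustration only, candidate PAM positions in a target nucleic acid may be located by pattern matching. The following is a minimal sketch that expands the standard IUPAC nucleotide ambiguity codes (for example, K is G or T, and Y is C or T) so that both exact PAMs such as TTC and degenerate PAMs such as KYG, each recited elsewhere in this disclosure, can be searched on a single strand. The function names `pam_to_regex` and `find_pam_sites` and the toy target sequence are hypothetical, and the sketch makes no assumption about which strand carries the PAM or on which side of the target sequence the PAM lies.

```python
import re

# Standard IUPAC nucleotide ambiguity codes for DNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pam_to_regex(pam: str) -> "re.Pattern[str]":
    """Compile a PAM written in IUPAC codes into a regular expression.

    A zero-width lookahead is used so that overlapping sites are reported.
    """
    body = "".join("[" + IUPAC[base] + "]" for base in pam.upper())
    return re.compile("(?=(" + body + "))")


def find_pam_sites(target: str, pam: str) -> list:
    """Return 0-based start positions of every PAM match on the given strand."""
    return [m.start() for m in pam_to_regex(pam).finditer(target.upper())]


if __name__ == "__main__":
    target = "AATCGGTTGCCG"  # toy sequence, not from this disclosure
    print(find_pam_sites(target, "TCG"))
    print(find_pam_sites(target, "KYG"))
```

Every site matching TCG also matches KYG, since T is one of the bases encoded by K and C is one of the bases encoded by Y; the degenerate motif simply admits additional sites.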

Also disclosed herein are compositions comprising a polypeptide, or anucleic acid encoding the polypeptide, and an engineered guide nucleicacid, wherein the polypeptide comprises an amino acid sequence that isat least 90% identical to SEQ ID NO: 23. In some embodiments, thepolypeptide comprises an amino acid sequence that is at least 95%identical to SEQ ID NO: 23. In some embodiments, the engineered guidenucleic acid comprises a sequence that is at least 80% identical to asequence selected from: SEQ ID NOS: 624, 628, 630, 634, 638, 641, 643,and 645. In some embodiments, the engineered guide nucleic acidcomprises a sequence that is at least 95% identical to a sequenceselected from: SEQ ID NOS: 624, 628, 630, 634, 638, 641, 643, and 645.In some embodiments, the polypeptide is fused to at least one nuclearlocalization signal. In some embodiments, the polypeptide is capable ofbinding to a target nucleic acid but has reduced or no nuclease activityon the target nucleic acid. In some embodiments, the compositioncomprises a fusion partner protein fused to the polypeptide. In someembodiments, the polypeptide is a nuclease that is capable of cleavingat least one strand of a target nucleic acid. In some embodiments, thecomposition further comprises a target nucleic acid, and wherein thetarget nucleic acid comprises a PAM sequence selected from any one ofSEQ ID NOs: 156-159, 325-328, and 369. In some embodiments, thecomposition comprises a donor nucleic acid.

Also disclosed herein are compositions comprising an effector protein, or a nucleic acid encoding the effector protein, and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to a sequence selected from any one of SEQ ID NOs: 1-45, 202-293, and 728-731. In some embodiments, the engineered guide nucleic acid comprises a sequence selected from: SEQ ID NOs: 624, 628, 630, 634, 638, 641, 643, 645, 646, and 827-929. In some embodiments, the effector protein and engineered guide nucleic acid form a complex that recognizes a protospacer adjacent motif selected from TABLE 39. In some embodiments, the effector protein comprises an amino acid sequence that is at least 95% identical to a sequence selected from any one of SEQ ID NOs: 1-45, 202-240, and 728-731. In some embodiments, the effector protein comprises an amino acid sequence selected from SEQ ID NOs: 241-293. In some embodiments, the engineered guide nucleic acid is a single guide RNA. In some embodiments, the composition comprises a nuclear localization signal linked to the effector protein. In some embodiments, the length of the effector protein is about 380 to about 500 linked amino acids. In some embodiments, the composition comprises a fusion partner protein fused to the effector protein. In some embodiments, the effector protein is a nuclease that can cleave at least one strand of a target nucleic acid. In some embodiments, the effector protein is a nuclease that can cleave both strands of a double stranded target nucleic acid. In some embodiments, the composition comprises at least one of a detection reagent and an amplification reagent. In some embodiments, the detection reagent is selected from: a reporter nucleic acid, a detection moiety, an additional polypeptide, and a combination thereof. In some embodiments, the amplification reagent is selected from: a primer, a polymerase, a dNTP, an rNTP, and a combination thereof. In some embodiments, the nucleic acid encoding the effector protein is an expression vector. In some embodiments, the expression vector comprises or encodes the engineered guide nucleic acid. In some embodiments, the expression vector is an adeno-associated viral vector. In some embodiments, the nucleic acid encoding the effector protein is a messenger RNA. In some embodiments, the composition comprises a lipid or lipid nanoparticle. In some embodiments, the composition comprises a donor nucleic acid. In some embodiments, the engineered guide nucleic acid comprises a first sequence, wherein the effector protein can bind the first sequence; and a second sequence that hybridizes to a target sequence of a target nucleic acid. In some embodiments, the target sequence is a eukaryotic sequence.

Also disclosed herein are compositions comprising an effector protein,or a nucleic acid encoding the effector protein, and an engineered guidenucleic acid, wherein the effector protein comprises an amino acidsequence that is at least 75%, at least 80%, at least 85%, at least 90%,at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO23. In some embodiments, the effector protein comprises an amino acidsequence that is at least 95% identical to SEQ ID NO: 23. In someembodiments, the effector protein comprises the sequence of SEQ ID NO:23. In some embodiments, the engineered guide nucleic acid comprises asequence that is at least 80% identical to a sequence selected from: SEQID NOS: 624, 628, 630, 634, 638, 641, 643, 645, 646, and 855-873. Insome embodiments, the engineered guide nucleic acid comprises a sequencethat is at least 95% identical to a sequence selected from: SEQ ID NOS:624, 628, 630, 634, 638, 641, 643, 645, and 855-873. In someembodiments, the engineered guide nucleic acid comprises a sequence thatis at least 80% identical to a sequence selected from: SEQ ID NOS: 645,932, 857, 933, 934, 935, 936, 737, 747, 750, 761, 763, 765, 769, 773,780, 782, or 785. In some embodiments, the engineered guide nucleic acidcomprises a sequence that is at least 95% identical to a sequenceselected from: SEQ ID NOS: 645, 932, 857, 933, 934, 935, 936, 737, 747,750, 761, 763, 765, 769, 773, 780, 782, or 785. In some embodiments, theengineered guide nucleic acid comprises a sequence that is at least 80%identical to SEQ ID NO: 645. In some embodiments, the engineered guidenucleic acid comprises a sequence that is at least 95% identical to SEQID NO: 645. In some embodiments, the effector protein and engineeredguide nucleic acid form a complex that recognizes a protospacer adjacentmotif selected from: TCG, and KYG. 
In some embodiments, the effectorprotein comprises a mutation that reduces an enzymatic activity of thepolypeptide relative to the polypeptide that is 100% identical to SEQ IDNO: 23. In some embodiments, the effector protein is capable of bindingto the target nucleic acid but has reduced or no nuclease activity onthe target nucleic acid. In some embodiments, the amino acid sequence ofthe effector protein comprises one or more amino acid alterations. Insome embodiments, the amino acid sequence of the effector proteincomprises one or more amino acid alterations in a domain selected from aREC domain and a RuvC domain. In some embodiments, the one or more aminoacid alterations are selected from: A110R, T111R, E112R, M113R, S114R,T115R, Q116R, S117R, L118R, S119R, F122R, A123R, T124R, E125R, L126R,E127R, T128R, N129R, I130R, F131R, A132R, K261R, V263R, V264R, G265R,V266R, D267R, D267A, D267N, L268R, G269R, I270R, N271R, V272R, P273R,A274R, Y275R, V276R, A277R, T278R, N279R, I280R, T281R, E282R, E363Q,I457R, A458R, N459R, S460R, K461R, D462R, I463R, I464R, K466R, N467R,E468R, and any combination thereof, relative to SEQ ID NO: 23. In someembodiments, the engineered guide nucleic acid is a single guide RNA. Insome embodiments, the composition comprises a nuclear localizationsignal linked to the effector protein. In some embodiments, the lengthof the effector protein is about 380 to about 500 linked amino acids. Insome embodiments, a fusion partner protein fused to the effectorprotein. In some embodiments, the effector protein is a nuclease thatcan cleave at least one strand of a target nucleic acid. In someembodiments, the effector protein is a nuclease that can cleave bothstrands of a double stranded target nucleic acid. In some embodiments,the composition comprises at least one of a detection reagent and anamplification reagent. 
In some embodiments, the detection reagent is selected from: a reporter nucleic acid, a detection moiety, an additional polypeptide, and a combination thereof. In some embodiments, the amplification reagent is selected from: a primer, a polymerase, a dNTP, an rNTP, and a combination thereof. In some embodiments, the nucleic acid encoding the effector protein is an expression vector. In some embodiments, the expression vector comprises or encodes the engineered guide nucleic acid. In some embodiments, the expression vector is an adeno-associated viral vector. In some embodiments, the nucleic acid encoding the effector protein is a messenger RNA. In some embodiments, the composition comprises a lipid or lipid nanoparticle. In some embodiments, the composition comprises a donor nucleic acid. In some embodiments, the engineered guide nucleic acid comprises a first sequence, wherein the effector protein can bind the first sequence; and a second sequence that hybridizes to a target sequence of a target nucleic acid. In some embodiments, the target sequence is a eukaryotic sequence.

Also disclosed herein are systems or kits comprising one or more components of any one of the compositions disclosed above, wherein the one or more components of the system are separate.

Also disclosed herein are pharmaceutical compositions, comprising the composition disclosed above and a pharmaceutically acceptable excipient.

Also disclosed herein are methods of modifying a target nucleic acid in a sample, comprising contacting the sample with a composition disclosed above or the system disclosed above, thereby generating a modification of the target nucleic acid; and optionally detecting the modification.

Also disclosed herein are methods of detecting a target nucleic acid in a sample, comprising the steps of: contacting the sample with: (i) the composition disclosed above or the system disclosed above; and (ii) a reporter nucleic acid comprising a detectable moiety that produces a detectable signal in the presence of the target nucleic acid and the composition or system, and detecting the detectable signal.

In some embodiments, the method comprises contacting the target nucleic acid with a donor nucleic acid.

Also disclosed herein are cells comprising the compositions disclosed above. Also disclosed herein are cells produced by methods disclosed above. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is a T cell, optionally wherein the T cell is a natural killer T cell (NKT). In some embodiments, the cell is an induced pluripotent stem cell (iPSC). Also disclosed herein are populations of cells.

Also disclosed herein are methods of treating or preventing a disease comprising administering to a subject in need thereof a composition, a pharmaceutical composition or a cell disclosed above.

Also disclosed herein are compositions comprising an effector protein, or a nucleic acid encoding the effector protein, and a guide nucleic acid, or a nucleic acid encoding the guide nucleic acid, wherein the effector protein comprises an amino acid sequence that (a) is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23 and (b) includes an amino acid sequence selected from the group: (i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799, and wherein the effector protein interacts with the guide nucleic acid to form a complex that is targeted to a target sequence via base pairing between the guide nucleic acid and the target sequence.

Also disclosed herein are compositions comprising an effector protein, or a nucleic acid encoding the effector protein, and a guide nucleic acid, or a nucleic acid encoding the guide nucleic acid, wherein the effector protein comprises a sequence of amino acids that is at least 37%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, and wherein the effector protein interacts with the guide nucleic acid to form a complex that is targeted to a target sequence via base pairing between the guide nucleic acid and the target sequence.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

FIG. 1 illustrates PAM preferences for different D2S effector proteins disclosed herein. Frequency of nucleotides at each PAM position was independently calculated using a position frequency matrix (PFM) and plotted as a WebLogo. The number at the top of the plot corresponds to the composition number of TABLE 2 and TABLE 3, denoting the D2S effector protein used, as well as the combination of crRNA, sgRNA, and/or tracrRNA sequence.

FIG. 2 shows that proteins described herein edit the genome of mammalian cells.

FIG. 3 shows that proteins described herein edit the genome of mammalian cells at multiple doses.

FIG. 4 shows that proteins described herein, with a REC domain alteration, bind two genome loci of mammalian cells and edit the genome at the locus with varying efficacy normalized to the wild-type. The x- and y-axes of the plot correspond to various targeted loci. The identifier next to each plotted data point denotes the amino acid residue alteration and position in reference to SEQ ID NO: 23.

FIG. 5 shows that proteins described herein, with a RuvC-I domain alteration, bind two genome loci of mammalian cells and edit the genome at the locus with varying efficacy normalized to the wild-type. The x- and y-axes of the plot correspond to various targeted loci. The identifier next to each plotted data point denotes the amino acid residue alteration and position in reference to SEQ ID NO: 23.

FIG. 6 shows that proteins described herein, with a RuvC-II domain alteration, bind two genome loci of mammalian cells and edit the genome at the locus with varying efficacy normalized to the wild-type. The x- and y-axes of the plot correspond to various targeted loci. The identifier next to each plotted data point denotes the amino acid residue alteration and position in reference to SEQ ID NO: 23.

FIGS. 7A-7E illustrate PAM preferences for different D2S effector proteins disclosed herein generated from in vitro enrichment (E. coli and mammalian) as described in Examples 5, 6, 12, and 13. Frequency of nucleotides at each PAM position was independently calculated using a position frequency matrix (PFM) and plotted as a WebLogo. The numbers at the bottom of each plot correspond to the D2S effector protein used as well as the combination of crRNA, sgRNA, and/or tracrRNA sequences.

FIGS. 8A-8D illustrate change in gene expression of NEUROD1, HBG1, ASCL1, and LIN28A by different VPR-CasM fusions. FIG. 8A is the change in gene expression by CasM.286251 (D267A) with an N-terminal VPR fused by an XTEN10 linker. FIG. 8B is the change in gene expression by CasM.19952 (D267A) with an N-terminal VPR fused by an XTEN10 linker. FIG. 8C is the change in gene expression by CasM.19952 (D267N) with an N-terminal VPR fused by an XTEN10 linker. FIG. 8D is the change in gene expression by CasM.19952 (E363Q) with an N-terminal VPR fused by an XTEN10 linker. The Y-axis shows the relative fold change of RNA levels. The X-axis shows the guide sequences tested. NT denotes a guide with the enzyme's repeat but a scrambled spacer sequence, gpool8 is a pooled control of the guides, and dCas9 is a catalytically inactive “dead” Cas9.

FIG. 9 illustrates the constructs used for base editing of different target genes. The C term and N term indicate the location of the base editing effector relative to the dCasM.19952 (D267A) protein. The CBE/ABE indicate the location of the effector. The XTEN is the linker used (e.g., XTEN10, XTEN40 or XTEN80). The tagBFP indicates a blue fluorescent protein and t2A indicates a self-cleaving peptide sequence. FIG. 9 at the bottom shows the indel percentage of catalytically active CasM.19952 and gRNAs at respective target sites.

FIGS. 10A-10B illustrate a change in base call percent along the spacer sequence for the CIITA t26 target. The upper X-axis shows the target sequence along the spacer and the Y-axis shows the % change in base call per nucleotide. FIG. 10A shows the ABE8e-XTEN10-dCasM.19952(D267A) construct editing of CIITA t26. The editing appeared at position A9 (about 0.94% of As were changed to Gs). FIG. 10B shows the AncBE4Max-XTEN10-dCasM.19952(D267A) construct editing of CIITA t26. The editing appeared at positions C6 and C8 (about 0.70-0.75% of Cs were changed to Ts). The editing at C18 is believed to have occurred from experimental noise.

FIGS. 11A-11B show the conserved motifs that are shared by D2S effector proteins. FIG. 11A shows weblogos of the multilevel consensus sequences of the conserved motifs. Weblogos corresponding to MEME_1, MEME_2, MEME_3, MEME_4, MEME_5, MEME_6 and MEME_7 are shown to the right of the “MEME” descriptor. FIG. 11B shows the location of the detected motifs in the D2S effector proteins.

FIG. 12 shows Sanger sequencing reads of target and non-target strands from a CasM.19952 sgRNA complex and a target nucleic acid having a PAM of GTCG.
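
As an illustration of the position frequency matrix (PFM) calculation referred to for FIG. 1 and FIGS. 7A-7E, the following is a minimal sketch in Python. It assumes a set of equal-length PAM sequences recovered from an enrichment experiment; the function name, the input sequences, and the choice to normalize counts to fractions are illustrative assumptions and are not taken from the Examples.

```python
from collections import Counter


def position_frequency_matrix(pams, alphabet="ACGT"):
    """Return, for each PAM position, the fraction of each nucleotide.

    `pams` is a list of equal-length PAM strings (hypothetical input).
    Each position is tallied independently of the others.
    """
    if not pams:
        raise ValueError("at least one PAM sequence is required")
    length = len(pams[0])
    if any(len(p) != length for p in pams):
        raise ValueError("all PAM sequences must have the same length")

    matrix = []
    for pos in range(length):
        counts = Counter(p[pos].upper() for p in pams)
        total = sum(counts[base] for base in alphabet)
        if total == 0:
            raise ValueError(f"no recognized nucleotides at position {pos}")
        matrix.append({base: counts[base] / total for base in alphabet})
    return matrix


# Hypothetical enriched PAMs, for illustration only.
example_pams = ["TCG", "TTG", "GCG", "TCG"]
pfm = position_frequency_matrix(example_pams)
```

Each row of the resulting matrix can be passed to a sequence logo tool to produce a plot of the kind described for the figures.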

DETAILED DESCRIPTION OF THE INVENTION

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure. Herein, the use of the singular includes the plural unless specifically stated otherwise.

The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described. All documents, or portions of documents, cited in this application, including, but not limited to, patents, patent applications, articles, books, and treatises, are hereby expressly incorporated by reference in their entirety for any purpose.

II. Definitions

Unless otherwise indicated, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Unless otherwise indicated or obvious from context, the following terms have the following meanings:

As used herein, the use of “or” means “and/or” unless stated otherwise. Furthermore, the use of the term “including” as well as other forms, such as “includes” and “included”, is not limiting.

“Percent identity,” “% identity,” and “% identical” refer to the extent to which two sequences (nucleotide or amino acid) have the same residue at the same positions in an alignment. For example, “an amino acid sequence is X % identical to SEQ ID NO: Y” can refer to % identity of the amino acid sequence to SEQ ID NO: Y and is elaborated as X % of residues in the amino acid sequence are identical to the residues of the sequence disclosed in SEQ ID NO: Y. Generally, computer programs can be employed for such calculations. Illustrative programs that compare and align pairs of sequences include ALIGN (Myers and Miller, Comput Appl Biosci. 1988 March; 4(1):11-7), FASTA (Pearson and Lipman, Proc Natl Acad Sci USA. 1988 April; 85(8):2444-8; Pearson, Methods Enzymol. 1990; 183:63-98) and gapped BLAST (Altschul et al., Nucleic Acids Res. 1997 Sep. 1; 25(17):3389-3402), BLASTP, BLASTN, or GCG (Devereux et al., Nucleic Acids Res. 1984 Jan. 11; 12(1 Pt 1):387-95).
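
By way of illustration only, percent identity over an existing pairwise alignment can be sketched as follows. The sketch assumes the two sequences have already been aligned by a program such as those listed above and are supplied as equal-length strings with "-" marking gaps. The choice of denominator (all alignment columns in which at least one sequence has a residue) is an assumption made for this example; the programs cited above may use other conventions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption for this sketch: columns where both sequences have a gap
    are ignored, and a residue aligned to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / compared


# Toy alignment, for illustration only: 5 identical columns out of 6.
identity = percent_identity("MAKG-T", "MAKGLT")
```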

When comparing two protein sequences, it may be useful to not only look at the percent identity between the aligned sequences, but also at their percent similarity. Certain amino acid substitutions are considered more conservative than others; two amino acids may share characteristics such as electrochemical properties. In these cases, substituting the amino acid may not significantly affect the structure or function of the protein. Therefore, the sequences' % identity may not accurately describe their similarity. Additionally, protecting protein sequences solely on identity runs the risk of other parties skilled in the art making conservative amino acid substitutions (e.g., changing every leucine to an isoleucine) and still obtaining a functional protein. In some instances, compositions and methods disclosed herein comprise an effector protein, or a use thereof, that is substantially similar to an effector protein sequence disclosed herein. Example 25 describes an exemplary method for calculating % similarity.

As used in the specification and claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise. For example, the term “a sample” includes a plurality of samples, including mixtures thereof.

Unless specifically stated or obvious from context, as used herein, the term “about” in reference to a number or range of numbers is understood to mean the stated number and numbers +/−10% thereof, or 10% below the lower listed limit and 10% above the higher listed limit for the values listed for a range.
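
The arithmetic implied by this definition can be sketched as follows. The function names are hypothetical, and the example values are taken from the phrase “about 380 to about 500 linked amino acids” used above.

```python
def about(value, tolerance=0.10):
    """Bounds covered by 'about <value>': the value +/- 10% thereof."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def about_range(lower, upper, tolerance=0.10):
    """Bounds covered by 'about <lower> to about <upper>'.

    10% below the lower listed limit and 10% above the higher listed limit.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return (lower - abs(lower) * tolerance, upper + abs(upper) * tolerance)


single = about(100)            # roughly 90 to 110
length_bounds = about_range(380, 500)  # roughly 342 to 550
```

Under this reading, “about 380 to about 500” spans approximately 342 to 550.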

The term “alteration” as used herein can refer to the insertion, deletion, or substitution of an amino acid in an amino acid sequence at a position identified relative to a reference or parent sequence.
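
Substitutions in this disclosure are written in a compact form such as D267A. The following sketch shows one way to read and apply such a label, assuming the common convention of original residue, 1-based position in the reference sequence, and replacement residue. The function names and the toy sequence are hypothetical and are used only for illustration; the sketch does not handle insertions or deletions.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'D267A' into (original, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitution(reference, label):
    """Return the reference sequence with the labeled substitution applied.

    Positions are 1-based relative to the reference sequence. The original
    residue in the label must match the reference at that position.
    """
    original, position, replacement = parse_substitution(label)
    if not 1 <= position <= len(reference):
        raise ValueError(f"position {position} is outside the reference")
    if reference[position - 1] != original:
        raise ValueError(
            f"reference has {reference[position - 1]!r} at position "
            f"{position}, not {original!r}"
        )
    return reference[: position - 1] + replacement + reference[position:]


# Toy reference sequence, for illustration only.
variant = apply_substitution("MKDLG", "D3A")
```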

As used herein, the term “comprising” and its grammatical equivalents specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

The terms “determining,” “measuring,” “evaluating,” “assessing,” “assaying,” and “analyzing” are often used interchangeably herein to refer to forms of measurement. The terms include determining if an element is present or not (for example, detection). These terms can include quantitative, qualitative or quantitative and qualitative determinations. Assessing can be relative or absolute. “Detecting the presence of” can include determining the amount of something present in addition to determining whether it is present or absent depending on the context.

As used herein, a “catalytically inactive effector protein” refers to an effector protein that is modified relative to a naturally-occurring effector protein to have a reduced or eliminated catalytic activity relative to that of the naturally-occurring effector protein, but retains its ability to interact with a guide nucleic acid. The catalytic activity that is reduced or eliminated is often a nuclease activity. The naturally-occurring effector protein may be a wildtype protein. The catalytically inactive effector protein can be referred to as a catalytically inactive variant of an effector protein, e.g., a Cas effector protein.

The term “in vivo” is used to describe an event that takes place in a subject's body. The term “ex vivo” is used to describe an event that takes place outside of a subject's body. An ex vivo assay is not performed on a subject. Rather, it is performed upon a sample separate from a subject. An example of an ex vivo assay performed on a sample is an “in vitro” assay. The term “in vitro” is used to describe an event that takes place in a container for holding laboratory reagents such that it is separated from the biological source from which the material is obtained. In vitro assays can encompass cell-based assays in which living or dead cells are employed. In vitro assays can also encompass a cell-free assay in which no intact cells are employed.

As used herein, the terms “treatment” or “treating” are used in reference to a pharmaceutical or other intervention regimen for obtaining beneficial or desired results in the recipient. Beneficial or desired results include but are not limited to a therapeutic benefit and/or a prophylactic benefit. A therapeutic benefit may refer to eradication or amelioration of symptoms or of an underlying disorder being treated. Also, a therapeutic benefit can be achieved with the eradication or amelioration of one or more of the physiological symptoms associated with the underlying disorder such that an improvement is observed in the subject, notwithstanding that the subject may still be afflicted with the underlying disorder. A prophylactic effect includes delaying, preventing, or eliminating the appearance of a disease or condition, delaying or eliminating the onset of symptoms of a disease or condition, slowing, halting, or reversing the progression of a disease or condition, or any combination thereof. For prophylactic benefit, a subject at risk of developing a particular disease, or a subject reporting one or more of the physiological symptoms of a disease, may undergo treatment, even though a diagnosis of this disease may not have been made.

A “genetic disease”, as used herein, refers to a disease caused by one or more mutations in the DNA of an organism. In some instances, a disease is referred to as a “disorder.” Mutations may be due to several different cellular mechanisms, including, but not limited to, an error in DNA replication, recombination, or repair, or due to environmental factors. Mutations may be encoded in the sequence of a target nucleic acid from the germline of an organism. A genetic disease may comprise a single mutation, multiple mutations, or a chromosomal aberration.

The term “variant” when used in reference to any amino acid or nucleic acid described herein refers to a sequence having a variation or alteration at an amino acid position or nucleic acid position as compared to a parent sequence. The parent sequence can be, for example, an unmodified, wild-type sequence, a homolog thereof or a modified variant of, for example, a wild-type sequence or homolog thereof.

III. Introduction

Disclosed herein are non-naturally occurring compositions and systems comprising an effector protein (e.g., a D2S effector protein), which can be referred to herein as an effector polypeptide, and an engineered guide nucleic acid, which may simply be referred to herein as a guide nucleic acid. In general, an engineered effector protein and an engineered guide nucleic acid refer to an effector protein and a guide nucleic acid, respectively, that are not found in nature. In some instances, systems and compositions comprise at least one non-naturally occurring component. For example, compositions and systems may comprise a guide nucleic acid, wherein the sequence of the guide nucleic acid is different or modified from that of a naturally-occurring guide nucleic acid. In some instances, compositions and systems comprise at least two components that do not naturally occur together. For example, compositions and systems may comprise a guide nucleic acid comprising a repeat region and a spacer region which do not naturally occur together. Also, by way of example, compositions and systems may comprise a guide nucleic acid and an effector protein that do not naturally occur together. Conversely, and for clarity, a D2S effector protein or guide nucleic acid that is “natural,” “naturally-occurring,” or “found in nature” includes D2S effector proteins and guide nucleic acids from cells or organisms that have not been genetically modified by a human or machine. The effector protein may be a Cas protein (i.e., an effector protein of a CRISPR-Cas system).

In some embodiments, an effector protein comprises a protein that is capable of modifying a nucleic acid molecule (e.g., by cleavage, editing, deamination, methylation, demethylation, oxidation, acetylation, deacetylation, or recombination). Such modifications may modulate the expression of the RNA and/or protein encoded by the nucleic acid molecule (e.g., increasing or decreasing the expression of a nucleic acid molecule). In some embodiments, modifying a nucleic acid molecule, such as a target nucleic acid molecule, comprises editing the nucleic acid molecule (e.g., deleting one or more nucleotides of the nucleic acid molecule, inserting one or more nucleotides into the nucleic acid molecule, mutating one or more nucleotides of the nucleic acid molecule), modulating the expression of the RNA and/or protein encoded by the nucleic acid molecule (e.g., increasing or decreasing the expression of a nucleic acid molecule, for example RNA), making epigenetic modifications of the nucleic acid (e.g., methylation, demethylation, acetylation, or deacetylation), or any combination thereof. Modifying can comprise the activity of the fusion partner of an effector protein. For example, an effector protein comprising a fusion partner can have the activity of increasing or decreasing the expression of the RNA and/or the protein of a target nucleic acid.

In some embodiments, a guide nucleic acid comprises a nucleic acid comprising: a first nucleotide sequence that hybridizes to a target nucleic acid; and a second nucleotide sequence that is capable of being connected to a programmable nuclease by, for example, being non-covalently bound by a programmable nuclease or hybridized to a separate nucleic acid molecule that is bound by a programmable nuclease. The first sequence may be referred to herein as a spacer sequence. The second sequence may be referred to herein as a repeat sequence. In some instances, the first sequence is located 5′ of the second nucleotide sequence. In some instances, the first sequence is located 3′ of the second nucleotide sequence.

In some instances, the guide nucleic acid comprises a non-natural nucleobase sequence. In some instances, the non-natural sequence is a nucleobase sequence that is not found in nature. The non-natural sequence may comprise a portion of a naturally-occurring sequence, wherein the portion of the naturally-occurring sequence is not present in nature, absent the remainder of the naturally-occurring sequence. In some instances, the guide nucleic acid comprises two naturally-occurring sequences arranged in an order or proximity that is not observed in nature. In some instances, compositions and systems comprise a ribonucleotide complex comprising an effector protein and a guide nucleic acid that do not occur together in nature. Engineered guide nucleic acids may comprise a first sequence and a second sequence that do not occur naturally together. For example, an engineered guide nucleic acid may comprise a sequence of a naturally-occurring repeat region and a spacer region that is complementary to a naturally-occurring eukaryotic sequence. The engineered guide nucleic acid may comprise a sequence of a repeat region that occurs naturally in an organism and a spacer region that does not occur naturally in that organism. An engineered guide nucleic acid may comprise a first sequence that occurs in a first organism and a second sequence that occurs in a second organism, wherein the first organism and the second organism are different. The guide nucleic acid may comprise a third sequence located at a 3′ or 5′ end of the guide nucleic acid, or between the first and second sequences of the guide nucleic acid. For example, an engineered guide nucleic acid may comprise a naturally occurring CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) coupled by a linker sequence.

In some embodiments, CRISPR RNA or crRNA is a type of guide nucleic acid, wherein the nucleic acid is RNA comprising a first sequence, often referred to herein as a spacer sequence, that hybridizes to a target sequence of a target nucleic acid, and a second sequence that is capable of being connected to a programmable nuclease by either a) hybridization to a portion of a tracrRNA or b) being non-covalently bound by a programmable nuclease. In some embodiments, the crRNA is covalently linked to an additional nucleic acid (e.g., a tracrRNA) that is bound by the programmable nuclease. In some embodiments, the crRNA and a tracrRNA are in a dual guide system and are not linked by a covalent bond. In such a dual guide system, the crRNA can be connected to the programmable nuclease by hybridization to a portion of the tracrRNA, and the tracrRNA includes a separate portion that is bound by the programmable nuclease.

In some instances, compositions and systems described herein comprise an engineered effector protein that is similar to a naturally occurring D2S effector protein. In some instances, the engineered effector protein and/or a naturally occurring D2S effector protein is referred to as a polypeptide. The engineered effector protein may lack a portion of the naturally occurring D2S effector protein. The D2S effector protein may comprise a mutation relative to the naturally-occurring D2S effector protein, wherein the mutation is not found in nature. The D2S effector protein may also comprise at least one additional amino acid relative to the naturally-occurring D2S effector protein.

For example, the D2S effector protein may comprise an addition of a nuclear localization signal (NLS) relative to the naturally occurring D2S effector protein.

In certain embodiments, the nucleotide sequence encoding the effector protein is codon optimized (e.g., for expression in a eukaryotic cell) relative to the naturally occurring sequence.

In some instances, compositions and systems provided herein further comprise a modified host cell comprising one or more D2S effector proteins, engineered guide nucleic acids, and/or nucleic acids encoding the same.

IV. Effector Proteins

In some embodiments, an effector protein comprises a protein, polypeptide, or peptide that non-covalently binds to a guide nucleic acid to form a complex that contacts a target nucleic acid, wherein at least a portion of the guide nucleic acid hybridizes to a target sequence of the target nucleic acid. A complex between an effector protein and a guide nucleic acid can include multiple effector proteins or a single effector protein. In some instances, the effector protein modifies the target nucleic acid when the complex contacts the target nucleic acid. In some instances, the effector protein does not modify the target nucleic acid, but it is fused to a fusion partner protein that modifies the target nucleic acid when the complex contacts the target nucleic acid. A non-limiting example of an effector protein modifying a target nucleic acid is cleaving of a phosphodiester bond of the target nucleic acid. Additional examples of modifications an effector protein can make to target nucleic acids are described herein and throughout.

An effector protein may be brought into proximity of a target nucleic acid in the presence of a guide nucleic acid when the guide nucleic acid includes a nucleotide sequence that is complementary with a target sequence in the target nucleic acid. The ability of an effector protein to modify a target nucleic acid may be dependent upon the effector protein being bound to a guide nucleic acid and the guide nucleic acid being hybridized to a target nucleic acid. An effector protein may also recognize a protospacer adjacent motif (PAM) sequence present in the target nucleic acid, which may direct the modification activity of the effector protein. One of skill in the art understands that the phrase, “an effector protein recognizes a PAM sequence,” may mean that the effector protein, when complexed with a guide nucleic acid, is capable of binding and optionally modifying a target nucleic acid. An effector protein may modify a nucleic acid by cis cleavage or trans cleavage. The modification of the target nucleic acid generated by an effector protein may, as a non-limiting example, result in modulation of the expression of the nucleic acid (e.g., increasing or decreasing expression of the nucleic acid) or modulation of the activity of a translation product of the target nucleic acid (e.g., inactivation of a protein binding to an RNA molecule or hybridization). An effector protein may be a CRISPR-associated (“Cas”) protein. An effector protein may function as a single protein, including a single protein that is capable of binding to a guide nucleic acid and modifying a target nucleic acid. Alternatively, an effector protein may function as part of a multiprotein complex, including, for example, a complex having two or more effector proteins, including two or more of the same effector proteins (e.g., dimer or multimer).
An effector protein, when functioning in a multiprotein complex, may have only one functional activity (e.g., binding to a guide nucleic acid), while other effector proteins present in the multiprotein complex are capable of the other functional activity (e.g., modifying a target nucleic acid). An effector protein may be a modified effector protein having reduced modification activity (e.g., a catalytically defective effector protein) or no modification activity (e.g., a catalytically inactive effector protein). Accordingly, an effector protein as used herein encompasses a modified or programmable nuclease that does not have nuclease activity.
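
PAM sequences in this disclosure are sometimes written with degenerate nucleotide codes, for example KYG, where under the IUPAC nucleotide code K denotes G or T and Y denotes C or T. The following is a minimal sketch of checking whether a candidate sequence is consistent with such a motif. The function name is hypothetical, and the sketch makes no assumption about where the PAM lies relative to the target sequence or on which strand; it compares equal-length strings only.

```python
# IUPAC nucleotide codes mapped to the bases each code permits.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches_pam(candidate, motif):
    """Return True if each base of `candidate` is permitted by `motif`."""
    candidate = candidate.upper()
    motif = motif.upper()
    if len(candidate) != len(motif):
        return False
    for base, code in zip(candidate, motif):
        if code not in IUPAC:
            raise ValueError(f"unknown IUPAC code: {code!r}")
        if base not in IUPAC[code]:
            return False
    return True


# The motifs TCG and KYG are taken from the text; candidates are illustrative.
tcg_matches_kyg = matches_pam("TCG", "KYG")
acg_matches_kyg = matches_pam("ACG", "KYG")
```

Note that TCG is itself one of the four sequences covered by KYG (GCG, GTG, TCG, and TTG).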

Provided herein, in certain embodiments, are compositions that comprise one or more D2S effector proteins. TABLE 1 provides illustrative amino acid sequences of D2S effector proteins. In some instances, the amino acid sequence of the D2S effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45 and 202-240. In some instances, the amino acid sequence of the D2S effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% similar to any one of SEQ ID NOs: 1-45 and 202-240.

TABLE 1 Exemplary Amino Acid Sequences of Effector Proteins SEQ ID NO:Effector Protein Name Effector Protein Amino Acid sequence 1MAKKGTNRKKMIVKVMKYELKYESGCADFNEMQNELWKLQRQTREVMNR CasM.298706TIQLCYHWSYVQADYCKQHGCARRDVKPCDVYETNATSLDGYIYQLFKDEYPNFLMANLIATLRKAHQKYDALLFDIQEGNSSIPSFKKDQPLIFSKEAIRLPECLSDKRQITLFCFSKPYKSAHPTLDKITFAVRARSASEKSIFDHIISGKYALGESQLVYEKKKWFFLLSYKFTPESVDVNPEKVLGVDLGVVNALCAGSVENPHDSLFIKGTEAIEQIRRLEARKRDLQKQARYPGDGRIGHGTKTRVSPVYQTRDAIARMQDTLNHRWSRALIDFACKKGYGTIQMEDLSGIKALESEKPYLKHWTYFDLQSKIIYKAEEKGIRVVKVNPKCTSRRCSACGYISKENRKNQVEFLCVNCGYHHNADYNAAQNLSIPQIDRLIEKQLKEQESEENEAGANPK 2MAKGTLSKVMKYELRYLDGCGDFQNMQKELWTLQRQSREILNRTIQIAYHW CasM.280604DYTDREQFKKTGQHLDIKAETGYKRLDGYIYDSLKEDVQNFASVNVNATIQKAWAKYKSSKIDVLRGDMSLPSYKSDQPLVLHAQSMKIFSSDDDDVLQVTLFSNAYKKACNYSNIRFIIGLHDATQRTIIKKVLSGDWGIGQSQIVYKRPKWFLYLTYNFSPEQHEVNPDKILGVDLGESIAIYASSIGEYGSLRIEGGEISAFAKQLEARKRSLQKQAAYCGKGRIGHGTKSRVSDVYKMEDKIANFRNTVNHRYSKMLIDYALKHMYGTIQMEDLSGIKKETGFPKFLQHWTYYDLQQKIEAKAKEHGINFIKVDPAFTSQRCSKCGNIDSENRPSQAVFCCKKCGYKTNADFNAS 3MNVTKVMRYQLIYQGGGGDFESLQNQLWEFQRQTRAILNKTIQTMYLATAN CasM.281060QEKFSEKALYHDLCAEYPDMISSTVNATLREATKKYRSSVREILAGRMSLPSYKRDHPILLHNQSVALKQGNQGSYFATISVFSRKYQQGTPGVKQPSFQLIAKDNTQRTILQRLLSGEYKLGQCQLIYIRPKWFLNVAYSFTPSEKALDQEKVLGVDLGCVYAIYASSYGNHGIFKISGDEITSFERKQAAIQNRAFKNDLTRIREIEERRKQKLEQARYCGEGRIGHGVKTRVAPAYQDEGKISRFRETINHRYSKALVDYAEKNGYGTIQMEDLSGIKSSTGFPKRLQHWTYFDLQQKIKYKAEEQGIKVVKIKPAYTSQRCSRCGHIDPANRKSQSEFKCIACGFSSNADYNASQNISMRNIEKIIQGK AN 4MAKGTITKVMKYELRYLGGFSDFHEMQKEVWQLQRQYREILNKTIQIALHW CasM.284933DYVSAQQFGESGTYLDIREETGYKTLDGYIYNCLKGAYSEMASANLNAAVQKAWKKYKNSKTQVLQGVMSLPSYKSDQPILIDKGNVKLSAEENNGRAVLTLFSRNYRDTRGLKGNVEFSVLLHDGTQKSIFRNLIDKTYALGQCQLVYERKKWFLLLTYSFTPAGHALDPEKILGVDLGECYALYASSCYAPGILKIEGGEIAEYALRLEKRKRSLQQQARYCGEGRIGHGTKTRVGVVYKAEDRIASFRETINHRYSKELVDYAVSNGYGTIQMEDLSAIQKDLGFPKRLRHWTYYDLQMKITNKAKEHGIAVVKIDPRYTSQRCSKCGHIDPANRPRQEEFCCTACGYACNADYNASQNIS IKGIEKIIQKMLSAKAD5 MSKGMLTKVMKYTLRYVGGCGDFHEMQSILWELQKQTRAVLNKTIQIAFEW 
CasM.287908DYRSREAFQETGEYLDVHAETGYKRLDGYIYNCLKNEYADFAGKNLNAAIQTAWKKYNQSKRDIQTGKMSLPSYRSNQPLIIHNDNVMISQDMQAAPSVRFTLLSLEYKKAHDLNTNPTFEVLINDGTQRAIFEKVRSGEYKLGQCMIQYDKKKWFLLLTYSFQPEKLTLDKNKILGVDLGETIVICASSVSERGRFVIDGGEITRFATQIEARKRSQQHQAAYCGEGRIGHGTKTRVDAVYKTEDRIANFRDTINHRYSRALVNYAVKHGFGTIQMEDLSGIKSSDDFPKFLRHWTYYDLQSKIESKAKERGIAVVKVNPRFTSRRCSKCGYIDEGNRKDQAHFCCLSCGFRANADFNASQNLSIK GIDKIIEKEYNANSKQT 6MGKPITKTMKYQIHYIDGCGDFHNMQKELWDLQRIVRQILNKTINESYLWFV CasM.288518RSEQYYRDTGENLSVEEQTGYKTLDGHIYNLLKQEYTQKLVSNSLNASIQAAYKKMKDSRRDVMIGTMSLPSYRSDQPIIIYNKNIKFSSHPEHGFVVDCSLFSDAYKKSQGYEKSVKFQVSVDDNTQRSIFENILTGNYKHGQCSIVYEKKKWFLLLTYSFVPEETKLDPDKILGVDVGVVYALYASSKGNHGTFKIKGDEAITFIQRVEARKHSRQLQGTYCGDGRIGHGTKTRVQPVYNERALISNFQDTINHRYSKALIDYAKKNGYGTIQMEDLSGIKEVQQYPKYLQHWTYYDLQLKIQYKAKEAGIGFVKVTPKYTSQRCSHCGNIDEANRPKQDVFRCTVCGYERNADYNASQNLSIKGIDRIIDDQLKQMNKANPKKTENA 7MSGGAITKVMKYDLTYKDGYGNFKDMQEAVWKLIRDTRTILNETIKIAYHW CasM.293891DYLNEKSKRETGEHLDLLEETGYKRLDGYIYDDLKDRFPDFASSNLNAAIQTAWKKYKQSQKDVYIGKMTLPSYKSDQPLPINKQSIKIYDEEREHIVELNLFSTKHKKEHGLASNVRFRINLHDNTQHAIYERVLSGEYTLGQCQLLYDRPKWFFILTYSFKPAQNKLDPDKILGVDMGETCALYASTFGEQGSFVINGGEVSEYAKREEARKRSLQKQAAVCGEGRIGHGTKTRVSSVYKEQERISNFRDTINHRYSKALIEYAVKNGCGTIQMEDLSGIRQSTDFPKFLRHWTYYDLQQKIKTKAKETGIAVSMIDPRYTSQRCSRCGHIDKANRKDQAHFHCLKCGYSCNADFNASQNISIRGI DKIIQKELGAKAKQTD 8MKEIAKVMKYQLIYLDGGGDFYELQQTLWDLQRQTREILNKTIQSMYLATAT CasM.294270NTAFEENALYHRFGAEYPMMAALNVNATLRTAKKRYTSTIKETLRGTMSLPSYKRDQPILLHNQTIHLALEDGQYSALFSVYSEKFQKAHEGVARPRFALMARDGTQRAILDRLLDGSYRLGQSQMTYEQKKWFLSLTYKFVPEVRELDKSKILGVDLGCVYAIYASSMQQKGIFKISGDEITEFEKRQAAMQNREPVSTLERVEQLEQRRWQKQQQARYCGEGRVGHGTGTRVAPAYRDADKIARFRDTINHRYSKALVEYAEKNGFGTIQMEDLSGIKEDTGFPKRLRHWTYFDLQTKIQYKAAERGITVVKIDPQYTSQRCSRCGYIDKANRASQEKFLCQSCGFEANADYNASQNISVE KIDKLIAKDKKKLART 9MGQVTKVMRYQLIYQDGGGDFYTVQQELWELQRQTREILNKTIQTMYLADA 
CasM.294491NKEKFDNAAERTLNRRFCVDHPDMYTKTVTATLRKAKAKYNASQKEILAGRMSLPSYKRDQPILLNPQGFKIEEESDSFFAAIAVFSDKYKNKHPDVDVKRLRFRLVVKDGTQRAIIRRVISGEYKLGRSQLLYSKKKWFLNVTYSFEPAEKKVDPDKILGVDLGCVYAIYASSFGSPGVFKISGDEVSSFERKQAAIQNRSPKSTLERVEKIEERHKQKQQQARYCGEGRIGHGTKTRIAPVYQDEDKIARFRDTVNHRYSKALIDYAEKNGYGTIQMEDLSGIKSATGFPKRLKHWTYYDLQTKIEYKAEERGIKVVKIDPRYTSQRCSRCGYIDSGNRKSQAEFCCMACGFSCNADYNASQNIS IGGIAKIIADKRKEADAK10 YLDIREETGYKTLDGYIYNCLKGAYSEMASANLNAAVQKAWKKYKNSKTQ CasM.295047VLQGVMSLPSYKSDQPILIDKGNVKLSAEENNGRAVLTLFSRNYRDTRGLKGNVEFSVLLHDGTQKSIFRNLIDKTYALGQCQLVYERKKWFLLLTYSFTPAGHALDPEKILGVDLGECYALYASSCYAPGILKIEGGEIAEYALRLEKRKRSLQQQARYCGEGRIGHGTKTRVGVVYKAEDRIASFRETINHRYSKELVDYAVSNGYGTIQMEDLSAIQKDLGFPKRLRHWTYYDLQMKITNKAKEHGIAVVKIDPRYTSQRCSKCGHIDPANRPRQEEFCCTACGYACNADYNASQNISIKGIEKIIQKMLS AKAD 11MAEKTIVKVMKFELRYIDGAGEFSEMQKHLWELQKQTREVLNKTIQMGYAL CasM.299588ECKRFAHHDKTGQWLDDKELTGSKYKAVADYINAELKEDYNIFYSDCRNSTVRKAYKKFKDAKNKIFSGEMSLPSYRSNQPIIIHNRNVIIRGNAESALVGLKVFSDGFKALHGFPAAVNFKLCVKDGTQRAIIENVISEIYKISESQLIYDNKKWFLILAYRFTQKKNDLNPDKILGVDLGVKFAVYASSIGEYGSFRIKGGEVTEFIKRLEKRKKSLQNQATVCGDGRIGHGTKTRVADVYKARDKISNFQDTINHRYSRAIVDYARKNGYGTIQLEKLDNSIEKKGDYSPVLVHWTYYDLRTKMEYKAAEYGIKVIAVEPKYTSQRCSKCGYISSENRKTQESFECIKCGYKCNADFNASQNLSVR DIDRIIDEYLGANPELT12 VVNVAKGALSKVMKFELSYLDGCGDFQNMQKELWTLQRQTREILNRTIQIA CasM.277328YHWDYTDREHFKKTGQHLDVKSETGYKRLDGYIYDELKETVQNFASVNVNATIQKAWAKYKSSKTDVLRGDMSLPSYKSDQPLVLHAQSIKLSEDKDGPVLQVTLFSNAHKKACDYSNVRFAFRLHDATQRAIFKNVLSGEYGLGQSQIVYKRPKWFLYLTYNFSPEQHGLDPDKILGVDLGESIALYASSLGDYGSLRIEGGEVTAFAKQLEARKRSLQKQAAHCGEGRVGHGTRARVSDVYKAEDKIANFRNTVNHRYSKKLIEYAIQNRYGTIQMEDLSGIKQDTGFPKFLQHWTYYDLQQKIEAKAKENGINFIKVDPSYTSQRCSKCGNIDSDNRPSQAVFCCTKCGFRANADFNASQNLSIPEIDKIIKKERGANTK 13MAKKGTNRKKMIVKVMKYELKYEKGCADFNEMQNELWKLQRQTREVMNR 
CasM.297894TVQLCYHWNYVQADYCKQHGCAHRDVKPCDVYETNATSLDGYIYQLFKDEYPNFLMANLIATLRKAHQKYDALLPDIQEGNSSIPSFKKDQPLIFSKEAIHLPECLSDKRQITLFCFSKPYKSAHPTLDKITFAVRAHSASEKSIFDNIINGKYALGTSQLVYEKKKWFFLLSYKFTPESVDVNPEKVLGVDLGVVNALCAGSVENPHDSLFIKGTEAIEQIRRLEARKRDLQKQARYPGDGRIGHGTKTRVSPVYQTRDAIARMQDTLNHRWSRALIDFACKKGYGTIQMEDLSGIKAMESEKPYLKHWTYFDLQSKIIYKAEEKGIRVVKVNPKCTSRRCSACGYISKENRKNQAEFLCVNCGYHHNADYNAAQNLSIPQIDRLIEKQLKEQESEESEAGANPK 14MTERHDNESSKIKAEVSLLNSSVPDFEKKRHVKVLKLHILKPAGDMKWDELG CasM.291449ALLRDARYRVFRLANLAISEAYLDFHKWRSGGNEQPKLKISQLNRNLRSMLEDEVTGKQTKMIKSDRYSKSGALPDSIVSPLSMYKLGGLTSKSKWSEVLRGKSSLPTFKLNMAIPVRCDKPGDRRIERTKNGDAEVELRICLQPYPRVIIATGRNSLGDGQRAILDRLLDNTKYSEQGYRQRCFEIKEDQRSGKWHLFVTYDFPAIEPAKNLSRERIVGVDLGAACPLYAAINTGHARLGWKHFSPLAARVRALQNQTIRRRRQILRGGKVSLSEDSARSGHGRKRKLKPISKLEGKIDRAYTTLNHQLSATVIKFAKDNGAGVVQMEDLKGLRETLTGTFLGERWRYEELQRFIRYKADEAGIEIRLVNPQYTSRRCSECGHIHKDFTREFRDKSREGNKSVRFLCPDCGFTADPDYNAARNLASLDIAAIIERQLEIQGLRKHDP 15MKEKSKTLVKVARLRILKPAGDMKWSELGEMLRTVRYRVFRLANLAVSEAY CasM.297599LGFHMYRTNRATEFKAETIGKLSRRLREMLIEEGVDEKDLSRYSQTGAVPDTVAGALGQYKIRGITSPTKWRQVVRGQAALPTFRNDMAIPIRCDKQYQRRLEKTEAGEIEVELMICRKPYPRIVLGTADLGPGQRAILERLLQNTDNSADGYRQRLFEAKQDTQTKKWWLYVTYDFPRLKEGKLNQEIVVGVDLGFSIPLYVALNIGHARLGRRHFQALGNRIRSLQRQVLARRRSIQRGGRVNISHSTARSGHGRKRKLLPTEKLRGRIEKSYSTLNHQLSASVIDFAKNHHAGTIQIEDLANLKEELAGTFIGARWRYHQLQQFLKYKAEEAGITLNQVNPRYTSRRCSECGFINIDFDRAFRDAGRTEGRVTKFLCPECGYEADPDYNAARNISILDIDKLIRVQCKKQGLTYDAH 16MPERPKTVNKVIWFQIHKPAGDMTWKELGNLLREARYRVFRLANLAVSEKY CasM.286588LSFHMWRTGQEYKSETIGKLNRRLREMLIEEGVEEESQKRFSATGALPDTVVSTLAKGKLAAITSKSKWKDVVNGKTSLPTFKLNMAIPVRCDKAEQRRLRRTESGDVELELMICKQPYPRVVLKTGKLKSGQRAILDRLVENNDNSKEGYSQRVFEIKQVENNDGSKEWRLYISYTFPKKAVEANADVAVGVDIGFSVPLVAAVNNGLERLGYNDFRALNERIRSLQRQVLVRRRSMQSGGRDYVSTPTARSGHGRKRKLLPIQTLRKRWDNAYTTLNHQLSHAVVSFAENHGAATIQIENVKSLKDELRGTFLGQRWRYFELQQFLKYKADEVGIELREVNARYTSRRCSECGYINMAFTRQARDKGRVDGKPMEFVCPECGYKAHPDYNAARNIAMLDIEQKMQVQCKQQG ITYADDSEVL 17MTWPELGNMLRTVRYRVFRLANLAVSEAYLGFHMFRTKRAEEFKAETMGK 
CasM.286910LSRRLREMLIEEGVDEKDLSRYSQTGAVPDTVAGALSQYKIRGITSPTKWRQIVRGQVALPTFRNTMSIPVRCDKLYQRRLEQGDSGEVEVELMICRNPYPRVVLGTGDLNPGQQAILERLLQNTDNSADGYRQRLFEIKEDVQTRKWWLYVTYDFPKTTGKLNPEIVVGVDLGFSIPLYVALNSGHARLGYLHFKALGERIKSLQKQVMARRRAIQRGGRVSISHSTARTGHGVKRKLQPTEKLRGRIEKSYSTLNHQLSASVIDFAKNHHAGVIQIEDLSGLKEQLTGTFIGARWRYHQLQQFLKYKAEEAGITLKQINPRYTSRRCSECGFINMDFDRAFRDAGRTYGKVTKFLCPECGYEADPDYNAARNIATLDIEKLIRVQCEKHGLKFDAH 18VGKEGKRNVKVMKIRILKPCDGMTWNELGQLLRDARYRVFRLANLTVSEAY CasM.292335LNFHLWRTGRSQEFKKQTIGQLNRQLRNILQQEKYDDEKLNRYSKTGALPDTVCSALWQYKLMAVMKKSKWSEVIRGKSSLPTFRNDMAIPVRCDKPEQKRIEKTEQGQVEAALQVCVQPYPRVILGTHTLGDGQDAILKRLLDNQNQAIGGYRQRSFEIKYDEQKRWWLFITYDFPATEVATDKTIAVGVDLGVSVPLYAAVNNGPARLGRREFGGLGRRIRDLRNQTDARRRSIQRSGREGQSDDTARAGHGRKRKLLPIHILEGRLDKAYTTLNHQMSAAVIKFAAEQGAGIIQIENLAGLQDELRGTFIGGRWRYRQLQDFLKYKTQEMGIELRQVNPKYTSRRCSKCGFIHKDFDRDYRNRHSENGKPAQFVCPNPDCKYESDPDYNAARNLATLDIEEQIRVQCQKQGLE YDSKKDKNAL 19MKEKSKTLVKVARLRILKPAGDMTWSELGEMLRTVRYRVFRLANLAVSEAY CasM.293576LGFHMFRTQRAAEFKAETMGKLSRRLREMLIEEGVDEKELNCYSLTGAVPDTVAGALHQYKIRGITSPTKWRQVVRGQAALPTFRNDMSIPIRCDKPYQRRLEKTEAGEVEVELMICRKPYPRIVLGTADVGPGQEVILERLLQNKDNSSDGYRQRLFEAKQDRQTGKWWLYVTYDFPRPEEGELNPEIVVGVDLGFSVPLYVAINNGYARLGRRHFQALGNRIRSLQRQVLARRRSIQRGGRVNISHDTARSGHGIKRKLLPTEKLRGRIEKSYSTLNHQLSASVIDFTKNHHAGTIQIEDLANLKEVLAGTFIGARWRYHQLQQFLKYKADEAGITLKEVNPRYTSRRCSECGFIHKDFDRAFRDSGRTDGKVARFVCPECGYGPVDPDYNAAKNISTLDIEKHIRVQCKKQGLEYEV H 20MKEKAKTLVKVARLRILKPAGDMTWPELGNMLRTVRYRVFRLANLAVSEA CasM.294537YLGFHMFRTKRAEEFKAETMGKLSRRLREMLIEEGVDEKDLSRYSQTGAVPDTVAGALSQYKIRGITSPTKWRQIVRGQVALPTFRNTMSIPVRCDKLYQRRLEQGDSGEVEVELMICRNPYPRVVLGTGDLNPGQQAILERLLQNTDNSADGYRQRLFEIKEDVQTRKWWLYVTYDFPKTTGKLNPEIVVGVDLGFSIPLYVALNSGHARLGYLHFKALGERIKSLQKQVMARRRAIQRGGRVSISHSTARTGHGVKRKLQPTEKLRGRIEKSYSTLNHQLSASVIDFAKNHHAGVIQIEDLSGLKEQLTGTFIGARWRYHQLQQFLKYKAEEAGITLKQINPRYTSRRCSECGFINMDFDRAFRDAGRTYGKVTKFLCPECGYEADPDYNAARNIATLDIEKLIRVQCEKHGLKFDA H 21MAKKAKTMFKVTNFRILKPAGDMTWKELGQLLRDARYRTFRMANLALSEA 
CasM.298538YLNFYLLKKGDLKEYKNVKIGQIAKRLRDMLIEEGVDEEVQNRFSPKVALPAYVYSALDQFKLRGLTSKSNWKKVLRGQASLPTFRLNMSVPIRCDKPEHRRLEKTENGNVEVDLMICRKPYPRVVLETLKLDGSSKAILDRLLENEDNSPGNYRQRCFEVKQNPRSNDWWLYVTYEMPVDKDKKLDPKVIVGVDLGFSVPLYVAINNGHARLGRRHFQALGKRIHNLQNQVLARRRSIQRGGQVNLSHSTSRSGHGRKRKLQPTEKLQQKINSAYSTLNHQLSSSVIDFANNHKAGTIQIEDLETLKEQLTGTYIGRQWRYYQLQQFIEYKAKENSITVKKINPKYTSRRCSMCGHIHADFDRTFRDRSSNKGFVTKFICPECNFEADPDYNAAKNISTLDIENKIKLQCKKQKIDY 22MPKITRKIELLFDRSGLSEEECKEKWRFIYQINDNLYRVANRLVNQLYLADEI CasM.19924DDILRLSDQEYIALRKKLANKKLDEATRISLEEQMSQVMKRVNERRSAILQRPQQSFAYSVVTDSDTEGLTAKILDVLKQDVLSHYKADTKEVLKGEKSISNYKKGMPIPFAFNDSLRLYKEDGFFYLKWYNGIRFLLNFGRDASNNQLIVERCLGISKDEISYKACSSSIQIKKKGNHSKIFLLLVVDVPVEQYAQKPNMVVGVDLGLNVPIYAASNSTLERKAIGSREAFLNQRGAFQRRFRALQRLQTTKGGRGRLHKLEPLERVREAERNWVRTQNHLFSREVINFAIDVGASTIQMEKLANFGRDAQGEVREDKKYVLRNWSYFELQNLIEYKAKRAGIKVKYINPAFTSQTCSECGQLGERDSIHFKCTNPDCPNCGKDIHADYNGARNIAKSKDYIK 23MPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.19952SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNE 24MPTITRKIELTLCTDGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.274559SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELKKKVAATEKEMTDQEHAICKYATEMSTQSLSYRFSTEFETKIFAKILDCLKQGVFATFNSDAKDVKRGERAIRNYKKGMPIPFAWTDSLRIKKDNKDFYLLWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKVKLFLLLVVSIPKEHVELNKKVVVSVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLKGTTGGKGRTKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKTHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMISYKAAKYGIKVEKIRPAYTSKTCSWCGQHGFREGVTFICENPACKQCGEKVHADYNAARNIANSKEIIKKNE 25MPTITRKIELTLCTEGLSDQERKDQWNLLYHINDNLYRAANNISSKLYLDDHV 
CasM.286251GSMVRLKHAEYLSLLRALEKAKKQKAPDEEVIAELSQQVATAEQEMDEQAKAICQYATEMSTQTLSYRFATELETNIFGQILTCLRQGVFSTFNSDARDVKRGERSIRTYKKGMPIPFPWNDSLRIGFEDGEFYLRWYNGLRFRFDFGKDRSNNCLIVQRCMKMDKDYEGDYKLCNSSIQMVKREGKPKFFLLLVVNIPQERVELNKNIVVGVDLGINAPAYVATNTTPERKQIGDREHFLNERMAFQRRFKSLQRLKGTTGGRGRAKKLEPLERLRKAEQNWVHTQNHLFSREVIDFAVKARAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMITYKAAKYGIKVEKIRPAYTSKTCSWCGHQGFREGITFICENPECKKFGEKEHADYNAARNIANSKEIIKNNE E 26MPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.288480STMVRMKHAEYLSLLRELARAEKQKKPDVDAIAELREKVTAAEKEMSDQERAICTYATEMSTQSLSYRFATEIETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIVKREGKVKLFLLLVVSIPQEHVELNKKIVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLKGTAGGKGRTKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVERIRPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNE 27MPTMTRKIELKLCTEGLSDEERKAQLGLLYHINDNLYKAANNISSKLYLDDH CasM.288668VSSMVRLKHAEYLSLLNEFEKAKKKGDEEQIVELSLRVAAAEKELTDQELAICKYATEMSTDTLAYRFANEIEINVFGQILACLKQGIHSTFKKDAADVKRGERAIRNFKKGMPIPFPWSKSIRIENEGSDFYLRWYNGLRFRFDFGKDRSNNRLIVSRCLNLDPDFEDEYKLSNSSLQMVKRDGRPKLFLLLVVNIPQENVELNKKIVVGVDLGINSPAYVATNITMERQRIGSRDTFLNARMAIQRRFQSLQKLQNTAGGRGRKKKLEPLERLKETERNWVRTQNHLFSRDVVQFAVKTRAATIHMEDLSGFGKDDDGNADEKKEFVLRNWSYYELQTMIKYKAAKYGIKVEKIRPAYTSRTCSWCGHEGDRKGETFICENPECEKYGKKENADYNAARNIANSTDIIK 28MPTITRKIELTLCTEGLSDEQRKEQWGLLYHINDNLYKAANNISSKLYLDEHV CasM.289206SSMVRMKHAEYLSLLKELARAEKQQTPDEGLIAELSRKLSAAEKEMADQELAICKYATEMSTQTLSYNFAKEIETNIFGQILTCLRQGVYATFNSDAKDVKRGERAIRNYKKGMPIPFPWNNSLKIESDSGEFYLRWYNGLRFLLTFGKDRSNNRMIVNRCMKMDEDFEGEYKLCNSSIQLAKRDGKPKLFLLLVVNIPQEHVKLNKKIVVGVDLGVNVPAYVATNITEERKAIGDREHFLNTRMAFQRRYKSLQRLKGTAGGKGRTKKLEPLERLRDAERNWVHTQNHLFSREVVNFAVQARAATIHMEDLSGFGKDKDGNADEKKEFVLRNWSFYELQNMIAYKSAKYGIKVVKIRPAYTSKTCSWCGQQGDRKSTTFICENPKCKHYGESIHADYNAARNIANSNDIVKENE 29MPKITRKIEMTLCTEGLSDEQRKEQWGLLYHINDNLYKAANNISTKLYLDEH 
CasM.290598VSSMVRMKHADYLSLLKELAKAEKKSPDEDLIAELREKLAAAEQEMTDQELAICKYATEMSTQTLAYKFATEIEINVFGQILACLKQAAQSNFKSDAKDVKRGERAIRNYKKGMPIPFPWNDNIRIDADGDEFYLRWYNGLRFHLTFGKDKSNNRMIVKRCLKMDKDFEGEYKLCNSSIQMVKRDGKPKLFLLLVVNIPQEHVELNKNVVVGVDLGVNVPAYVATNITEERKAIGEREHFLNTRMQIQRRYKSLQRLKATAGGKGRTKKLEPLERLRKAEHNWVHTQNHLFSREVVNFAVQTHAATIHMEDLSGFGKDDDGNADEQKEFVLRNWSFYELQNMIAYKAAKYGIKVEKVKPAYTSKTCSWCGQLGFRQGVTFICENPACKQCGEKVHADYNAARNIANSKDIIKKN E 30MPTITRKIELHLCTDGLTDEQQKAQRLLLYHINDNLYKAANNVSSKLYLDEH CasM.290816VSSMVRLKHDEYLSLSRELARAEKKHDDELTTELRGKLAAAEREMTDQELAICKYATEMSTQSLSYRLVTELETKIFAKILDCLKQGVYATFNSDARDVKRGERAIRNYKKGMPIPFAWNDSVRIEYDEKEKDFYLRWYNDIRFKFHFGRDRSNNRLIVSRCLKLDKDYEGDYQLCNSSIQIVKRDGSTKFFLLLVVKIPQEHVELNKRIVVGVDLGINYPAYVATNCTEERMYIGDREHFLNTRMQFQRRYKSLQKLKGTAGGKGRSKKLEPLERLRNAERNWVHTQNHLFSLKVVNFAVQTHAATIHLEDLSGFGKDDDGNADERKEFVLRNWSYYELQSMIEYKAKKYGIKVEKIRPAYTSQTCSWCGQRGFRQGVTFICENPECKKCGEKENADYNAARNIANSKDVIKDK NE 31TPFVLYFQNYSLSLRQHITLYSMPTITRKIELTLCTEGLSDQERKDQWNLLYHI CasM.295071NDNLYRAANNISSKLYLDDHVGSMVRLKHAEYLSLLRAMEKAKKQKAPDEEVIAELSQQVAAAEQEMDEQAKAICQYATEMSTQTLSYRFATELETNIFGQILTCLRQGVFSTFNSDARDVKRGERSIRTYKKGMPIPFPWNDSLRIGFEDGEFYLRWYNGLRFRFDFGKDRSNNRLIVQRCMKMDKDYEGDYKLCNSSIQMVKREGKPKFFLLLVVNIPQERVELNKNIVVGVDLGINAPAYVATNTTPERKQIGDREHFLNERMAFQRRFKSLQRLKGTTGGRGRAKKLEPLERLRKAEQNWVHTQNHLFSREVIDFAVKARAATIHMEDLSGFGKDRDGNADERKEFVLRNWSYYELQNMITYKAAKYGIKVEKIRPAYTSKTCSWCGHQGFREGITFICENPECKKFGEKEHADYNAARNIANSKEIIKNNEE 32MPTITRKIELHLCTEELSDEQQKAQRLLLYHINDNLYKAANNVSSKLYLDEHV CasM.295231SSMVRLKHDEYLSLLRELARAEKKADDELATQLREKLVAAEREMTDQELAICKYATEMSTQSLSYRFVTELETKIFAKILDCLKQGVYATFNSDSRDVKRGERAIRNYKKGMPIPFAWDKSVRIEYEEKEKDFFLRWYNDIRFKFHFGRDRSNNRLIVSRCMKLDKDYEGDYQLCNSSIQIVKRDGSTKYFLLLVVKIPQEHVELNKKIVVGVDLGINYPAFAATNCTEERMSIGDREHFLNTRMQFQRRFKSLQRLKGTTGGKGRNKKLEPLERLRKAEHNWVHTQNHLFSLKVVNFAVQAHAATIHLEDLSGFGKDDDGNADERKEFVLRNWSYYELQNMIKYKAKKFGIQVEKIRPAYTSQTCSWCGQRGFRQGITFICENPECKKCGEKENADYNAARNIANSKDIIKDKDE 33MPIITRKIELHISKEGLSAEDYKAQWQYLRQINDNLYMAANRVSSHCFLNDEY 
CasM.292139KYRLCLQIPDYIDIEKQLKDSKRARLSKEELGQLKKRKKELENTVKGRFQDEFEKNSLYTIISNEFGEIIPGQILTCLRQCVQSKYNRAKEELEKGERAISTYKKGMPIPFPINKSIRLQKQGEDFVLKWYNKIVFKLHFGRDRSNNRVIVERLIQSALNDKQKGEDYVMNNSSIQLVEKDKMTKIFLLLSMDIPTQKRKLDSELVLGVDLGLNFPLYYATNQSANIHDHIGDKDIFLKERMVFQRRFKELQRLQCTQGGRGRKKKLEPLEKLRDKERNWVRTKNHIFSREVIKVALHLGAGTIHLENLHNFGKDGNGELKNSKKFVFRNWSYFELQSMIEYKAKMEGITVKYVNPAYTSQTCSVCGMIGERKEQAVFRCMNSSCLEYGKEVNADFNAARNIAKAKM 34MPTITRKIELTLCTDGLSDDLRKDQWQLLYHINDNLYKAANNISSKLYLDEHV CasM.279423ASMVRLKHAEYLGLIKELAKARKRADDEAVRDLCSKLAVAEQEMNEQAKAICDYATEMSTQTLSYNFAKEIETNIFGQILTCLRQGVLLNFNSDARDVKRGERAIRNYKKGMPIPFPWNDTIKIVSEGDEFYLRWFSGLRFHLNFGKDRSNNRMIVRRCLKMEQDFDEEYKISNSSIQVAKRDGKQKLFLLLVVQIPQEQVVLNKKIVVGVDLGVNVPAYVATNITEERKAIGDREHFLNTRMQFQRRYKSLQRLKTTEGGRGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVNFALQTQAATINMEDLSGFGKDNDGNADECKEFVLRNWSYYELQNMIVYKASKYGIRVQKIRPAYTSKTCSWCGHMGFREGVTFICENPDCKQFGEKVHADYNAARNIANSKEIIKNDE 35MSKTVTKTVKIALICEHTNKYGEKVDYKDINKLLWKLQKQTRELKNKTIQLC CasM.20054WEYNNFSCDYYKEHHEYPNMEDILKYKRINGFVENKLKTVNDLYSSNCSTTILSTCNEFQNYRSEFLKGTRSINSYKSDQPLDLHKGAIKLEHDGKDFYVSLKLLKRSAFNAMEFKGSDIRFKLNVKDKDKSTLKILESCYDKIYSISASKMTYDRKAGKWFLLLAYSFTPAKTENLDPEKILGVDLGIKIPICASVYGDLDRLTIEGGKIEEFRRRVEARKRSLQKQGKQCGDGRIGHGTKKRIKPITDIGDKIARFRDTENHIYSRYLIEYAVKKGCGTIQMEKLEGITREKDIFLKNWTYFDLQKKIEYKAKEKGIKVVYIEPAYTSKRCSSCGFIDTDNRLDQAHFKCLKCGFNENADYNASQNIGIKNIDKIIKEEHKSASDKLTSE 36VIILTKVVKLYLISEQINKEGQKIDYQRINSILWDLQKQTRDIKNRTVQLCWE CasM.282673WMNFSSDYCKTQEEYPKERDILGYTLEGYVYDYFKTGYDLYTGNISTSSREVCSSFKNVKKEILKGERSILSYKANQPLDLHKKAISLEYDNFNFFVKLKLLNRTGKKKYDITEDINFKIQVNDKSTRTILERCYDKEYKISGSKLIYEKKKKLWRLNLCYSFENSQVETLEKDKILGIDLGIVYPLMASIYGEYDRFSIKGGEIEEFRRRTEARKRSILQQTKYCGDGRIGHGRNKRTQPAYKINDKIARFRDTANHKYSRALIEYAVKKNCGIIQMENLTGISDNTDCFLKDWSYYDLQTKIENKAKEMGIKVVYIKAQYTSQRCSRCGYIDVNNRIRQALFKCQNCGYETNADYNASQNIGMYDIEN IIEETLKIQSANVKQS 37MTKVTKVYLISEQIDKDGNKIDFKKISELLWNLQMQTRDIKNKCVQLCWEWL 
CasM.282952NFSSDYYKKSEEYPKEKDTLGYTLSGFVYDRIKNGSDLYSSNLSTSSRDTCTAFSNYKKEMLKGERSVLSFKANQPLDIHNKAIKLSYENGNFFVALKMLNRAGKEKYGIKDDLRFRMQVRDKSVRTILERLMNDEYKVSASKLMYDKKKKLWKLNLCYSFDNHVISTLDTEKIMGVDLGVVYPIMASVNGDYARFSIKGGEIEAFRSRVEARRRSLLNQSRYCGDGRIGHGRKKRTEPATQIADKIARFRDTTNHKYSRALIDYAIKNGCGTIQMEKLTGITSSAEHFLKEWSYFDLQTKIESKAKEAGIKVVYINPKFTSQRCNKCGYIHTDNRPVQARFCCQKCGYEENADYNASQNIGTKHI DVIIEETLKMQCEPETPTE38 MNKVVKLALICEQSDKDNSPVDYKKINEILWELQKQTREIKNKAIQYCWEYN CasM.283262NFSSDYYKKFNEYPKEKDILSYTLVGFVNDKFKTGNDLYSGNCSTTVRNACTEFKNSKKELIKGSRSIINYRSNQPLDIHNKCIRIEFENNCFYTYLKLLNRPAFKKYNFANTEIKFKILVRDNSTKTILERCISNEYEIAASKLLYDQKKKCWFLNLVYAFEIKSNNSLDPNKILGVDLGIHYPICASVYGSLDRFTIDGGEIDEFRRRVESRKISMLKQGKNCGDGRIGHGIKARNKPVYNIEDKIARFRDTANHKYSRALIEYAVKHTCGTIQMEDLTGITDIANRFLKNWSYYDLQTKIEYKAKEAGINIVYIDPKNTSRRCSKCGYIDKENRETQSRFICLKCGFKENADYNASQNIGIKDIDKLIKEDVH 39VTLLVKVVKIYLISEQFDKAGNQIDYKEVNKILWELQKQTREAKNKTVQLLW CasM.284833EWNNFSSDYVKASGIYPKAKDIFGYSSVHGQANKELRTKLALNSSNLSTTTMDVCKIFNTYKKEVWEGKRSVPSYKSDQPLDLHKESIKLIYENNEFYVRLALLKKAEFAKYGFKDGFRFKMQVKDNSTKTILERCFDEVYKINASKLLYDQKKKKWKLNLSYSFDNKNISELDKEKILGVDVGVNCPLVASVFGDRDRFIIKGGEIEKFRKSVEARRRSMLEQTKYCGDGRIGHGRKKRTEPALNIGDKIARFRDTTNHKYSRALIEYAVKKGCGTIQMEKLTGITSKSDRFLKDWTYYDLQTKIENKAKEVGINVVYIAPKYTSQRCSKCGYIHKDNRPNQAKFRCLECDFESNADYNASQNIGIKNIDKIIEKDLQKQESEVQVNENK 40MNKVVKLALICEQSDKNNSPVDYKKVNEILWELQKQTREIKNKTIQYCWEY CasM.287700YNFSSDYYKKFNKYPKEKDILSYTLWGFINDKFKTGNDLYSGNCSATTKKVIKEFKNSKKELIRGSRSIINYKSNQPLNIHNKCIHLQFKNNNFYVSINLLNRRSFKKYNFANTAIKFKILVRDNSTKAILERCISNEYKISESQLIYNKKKKCWFLNLSYAFEIKSNNSLDPNKILGVDLGIHYPICASVYGSLDRFTIDGGEIDEFRRRVESRKISMLKQGKNCGDGRIGHGIKARNKPVYNIEDKIARFRDTANHKYSRALIEYAVKNNCGTIQMEDLTGITDNANRFLKNWSYYDLQTKIEYKAKEASINVVYINPENTSRRCSKCGYIDKENRKTQSSFICLKCGFKENADYNASQNISIKDIDKLIKED VH 41VTLLVKVVKIHLISEQFDKAGNRIDYEEVNKILWELQKQTREAKNKTVQLLW 
CasM.291507EWNNFSSDYVKASGIYPKAKDIFGYSSVHGQANKELRTKLALNSSNLSTTTMDVCKNFNTYKKEVWKGKRSVPSYKSDQPLDLHKDSIKLIYENNQFYVRLALLKKAEFAKYGFKDGFHFKMQVKDNSTKTILERCFDEVYKINASKLLYDQKKKKWKLNLSYSFDNKNISELDKEKILGVDVGVSYPLVASVFGDRDRFKIKGGEIEKFRKSVEARRRSMLEQTKYCGDGRIGHGRKKRTEPALNIGDKIARFRDTTNHKYSRALIEYAVKKGCGTIQMEKLTGITSKADRFLKDWTYYDLQTKIENKAKEVGINVVYIAPKYTSQRCSKCGYIHKDNRPNQAKFRCLECDFESNADYNASQNIGIKNIDKIIEKDLQKQESEVQVNENK 42LIWKDALGGIILTKIVKLYLISEQIDKDGNRVDYKEINSILWNLQKQTRDIKNK CasM.293410TVQLCWEWMNFSSDYYKKNELYPNEKEILNLTLRGYAYDHFKQGYDLYSSNISVLTEAVCGAFKNAKKEMLNGEKSVLSYKAEQPLDIHKKCIKLEYDKNFYVKLKMLNKAGKKKYGIEDDLNFKIQVEDKSTRTILERCIDGEYVVSGSKLIYDKKKKLWKLNLCYSFKANEIESLDKNKILGIDLGIACPLMASVNGEFDRFSIKGGEIETFRKRIEARKRSVLHQTKYCGDGRIGHGRNKRTEPAYKINDKIARFRDTANHKYSRALIDYAIRKNCGMIQMENLTGISDKKEHFLKEWSYYDLQTKIENKAKEKGIKIVYINPEYTSQRCSKCGYIDANNRELRAVFKCQKCGFEADADYNASQNIGIKNIEDIIENTLKISSANEKQTKNT 43VFYSTFLCYILTKYIDFSANECYNINTSSEVKQLMNKVVKLALICEQSDKDNSP CasM.295105VDYKKINEILWELQKQTREIKNKAIQYCWEYNNFSSDYYKKFNEYPKEKDILSYTLVGFVNDKFKTGNDLYSGNCSTTVRNACTEFKNSKKELIKGSRSIINYRSNQPLDIHNKCIRIEFENNCFYTYLKLLNRPAFKKYNFANTEIKFKILVRDNSTKTILERCISNEYEIAASKLLYDQKKKCWFLNLVYAFEIKSNNSLDPNKILGVDLGIHYPICASVYGSLDRFTIDGGEIDEFRRRVESRKISMLKQGKNCGDGRIGHGIKARNKPVYNIEDKIARFRDTANHKYSRALIEYAVKHTCGTIQMEDLTGITDIANRFLKNWSYYDLQTKIEYKAKEAGINIVYIDPKNTSRRCSKCGYIDKENRETQSRFICLKCGFKENADYNASQNIGIKDIDKLIKEDVH 44LISEQIDKDGNRVDYKEINSILWNLQKQTRDIKNKTVQLCWEWMNFSSDYYK CasM.295187KNELYPNEKEILNLTLRGYAYDHFKQGYDLYSSNISVLTEAVCGAFKNAKKEMLNGEKSVLSYKAEQPLDIHKKCIKLEYDKNFYVKLKMLNKAGKKKYGIEDDLNFKIQVEDKSTRTILERCIDGEYVVSGSKLIYDKKKKLWKLNLCYSFKANEIESLDKNKILGIDLGIACPLMASVNGEFDRFSIKGGEIETFRKRIEARKRSVLHQTKYCGDGRIGHGRNKRTEPAYKINDKIARFRDTANHKYSRALIDYAIRKNCGMIQMENLTGISDNKEHFLKEWSYYDLQTKIENKAKEKGIKIVYINPEYTSQRCSKCGYIDANNRELRAVFKCQNCGFEADADYNASQNIGIKNIEDIIENTLKISSA NEKQTKNT 45LVKVVKIYLISEQVDEQGKDVDYNTICGVLWDLQWETREIKNKTVQLCWEW 
CasM.295929SGFSSDYYKKYGEYPKEKNLLDYTMGGFVYDKLKSKYHLYTANLSTTSQNTCGIFRTYKVDFVKGNRSVLSFKADQPLDVHKKSISIDRIDDNYFVKLKLLNKSGIQKYGIRDDFHFRMLVKDNSTKTILERCVGGDYKAAASKIIYDKKKKMWCLNLSYEFDVNTAKDLNKNRILGIDIGIVYPVVASVNGELDRFVIQGGEIETFRRRVENRKKSLLKQTKYCGDGRIGHGRNKRTEPVDIISDQIARFRNTANHKYSRAVIDYAVRKQCGTIQMENLKGITDKSDRFLKNWSYYDLQQKIEYKAKEKGINVVFINPKYTSQRCSRCGYIDSANRPKLPNQSKFLCIKCGFTENADYNASQNIALY NIEKLIDAEA 202LHETEKSLKFAEKYIAMPTITRKIELTLCTEGLSDEQRKEQWGLLYHINDNLY CasM.19498KAANNISSKLYLDEHVSSMVRMKHAEYLSLQKELARAEKQKVDDAIIVELTRKLAVAEKEMTDQELAICKYATEMSTNTLAYNFAKEIETKIFGQILACLENNAHALFVDDSPNVRRGERAIRNYKKGMPIPFPWNRSIKIEADGGEFYLRWYNGLRFLLTFGKDRSNNRLIVKRCMKMDEVFEGEYKLCNSSIQLAKRDGKPKLFLLLVVNIPQEHVELNKNIVVGVDLGVNVPAYVATNITEERKAIGDREHFLNTRMTFQKRYKSLQRLKGTAGGKGRTKKLEPLERLRDAERNWVHTQNHLFSREVVNFAVQARAATINIEDLSGFGKDNDGNADEKKEFVLRNWSYYELQNMITYKASKYGIKVEKIRPDYTSKTCSWCGQQGFREGVTFICENPECKQHGEKIHADYNA ARNIANSKDIIKKNE 203MAETKRLQKVAKFQIVKPVNMSWDELGRMLRDVRYRLSRLANMAVSETYQ CasM.19548NLHQRYRLKNQDAPKSLKIGQLSRNLRKILREEGVEEENLSKYSKTCVLPDTITGAFSRYKLSSIDWRKVLTGKISVPNYKTNLSIPIRCDKPHQRRLELTETGEIEADLMICNKPYPRVLLSTRTISDGQRTVLERLVSNKTNFLPGYRHRFFEVKEKKGKWELSVTYDFPKAEATRLHPDIIVGVDLGWSVPLYAAINNGYARIGYRKFEPLAKRIKHLQKQIKGRRFSTQKGGVKDLAQPTARAGHGRKRILKPIEKLEYKIDNAYTTLNHQLSHCVVEFAKNNGAGLIQIENLEGLKDDLSGTFIGQNWRYNQLQNFIKYKADEAGIKVHPVNPCYTSRRCSHCGFIHISFDREYRDKNRKNGKATMFECPKGCKPLNADYNAAKNLATFDIEEKIRLQCKQQSIEYKELPKD 204MPGTEKRLQKVATFEIVKPVNMSWPEFGKMLRDVRYRYWRLANMAVCENY CasM.19910MRFYQWRTQQTDANDRYKVKTLNRILRKMLIEEKNADEKELSRYSRDGAVSGYICGAFEKTKLSAVKSSSKWKKVIAGKESLPLFKKDLAIPINCSDHQPRLIERTQSGEYEVDLRICQQPYPRVLLSTAKISDGQKAILERLVSNETNSLPGYRHRFFEIKEKRNKWYLSVSYDFPKIDATRLHPNIIVGVDLGWSVPLYAAISNGYARIGYRKLKALGDRIKALQRQTIARRRSIQRTGEQDLSAPTARSGHGRKRILHPIEKLEGKIDNAYKTLNHQLSHCVIEFAKNHGAGLIQVENLKGLAEELSGTFIGQNWRYNQLQEFIKYKAKEAGIEVKEVNPCYTSRRCSECGFIHKEFTFEYRQANKKTDKATMFECPKCGYKAIADYNAARNLANPDIAEKIRLQCKEQGIEYKELPKD 205MPTITRKIELHFCTEGLSDEKQKEQRQLLYHINDNLYKAANNISSKLYLDEHV 
CasM.19948SSMVRLKHADYLSLQRELARAEKQKTPDDELITELSRKLSAAEKEMTDQELAICKYATEMATSTLAYNFAKEMETEIFGQILACLENNAHAVFVDDSLSVKRGERAIRNYKKGMPIPFPWNKNIKIETKDCEFYLRWYNGIRFRLHFGKDRSNNRLIVQRCLKLDDNFESEYKLCNSSIQLDKRDGKTKLFLLLVVNIPQEHVELNKNIVVGVDLGLNYPAYVATNSTEERKYIGDRDHFLKIRMQFQSRYKSLQRLKGTAGGKGRAKKLEPLERLRKAERNWVHTQNHLFSRDVVNFAVQTHAATIHMEDLSGFGKDNDGNADEKKEFVLRNWSYYELQSMIEYKAAKYGIKVEKIRPAYTSKTCSWCGQQGDRKSTTFICENPECKHYGESIHADYNAARNIANSKDIVKKNE 206MSKITRKIEIIPDIDGITHEESNKKCYNTFYKFDRKLYKVANLLVSQLYGLDNL CasM.265291LSLMRLQNDEYVKCQSKLSFKSITDATKEEIKKRMQEIDAELVSMKNDIAPKHPQTYSYRAVTSSEYAKDIPSDILNNLKQDVYQHFNENKKEQIRGERSLATYKKGMPIPFSFEKRHVIICDGDNYYLPWFEDTRFRLNFGRDRSNNRAIIDNCIKTKKYKLCAAAKIQLKERKLFLLITVDIPKAESVPVKGKVMGVDLGVINPAYVAVNDGPERSRIGNGEAFQKQRDVFRRRFRELQRSQLTQGGHGRKHKTKATEILRGKERNWVQTENHRISREIVNLASRWKVETIQMESLKGFGKNQEGEVEYNHKRLLGRWSYFELQKDIEYKAAMAGIAVQYVNPAYTSQTCHVCGQRGNRIERDTFICTNPECTCYNQAQDADMNAAINIAKSKDVIK 207MPTITRKIELTLCTDGLSDEERKAQWGLLYHINDNLYKAANNISSKLYLDEHV CasM.270012SSMVRLKHAEYLSLQKELAKAERQKMPDVDVIEELRERLSAAEQEMSDQELAICKYATEMSTNTLAYRFATEIETNIFGQILARLENNAQAVFLTDAPDVKRGERAIRNYKKGMPIPFPWNNSIKIECEGGEFYLRWYSGLRFHFNFGKDRSGNRLIVQRCLKLDKEYDGEYKLCNSSIQMVKRDGSTKFFLLMVVNIPQEYVELNKHIVVGVDLGINVPAYVATNITPERKAIGDREHFLNTRMAFQRRYKSLQRLKTTAGGKGRTKKLEPLERLRQAEHNWVHTQNHLFSREVVNFALQTHAATIHLEDLSGFGKDSDGNADERKEFVLRNWSYYELQNMITYKAAKYGIRVEKIRPAFTSRTCSCCGHEGFREGVTFICENPECQQFGEKVHADYNAARNIANSKDIIKKNE 208MPTITRKIELTLCTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.272451SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMSTETLAYKFATEIETNVFGQILACLKQAAQSNFKNDAKDVKRGERAIRNYKKGMPIPFPWNDSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKVKLFLLLVVSIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNTRMAFQRRYKSLQRLKGTAGGKGRTKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKTHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSFYELQNMITYKAAKYGIKVEKIRPAYTSKTCSCCGRQGFRSGVTFICENPECKQYGEKVHADYNAARNIANSKEIIKKNE 209MKNNVEEKRPDKEKRLTKVATFQIVKPVNMSWSEFGKMLRDVRYRLSRLAN 
CasM.274429MAVSEAYQNLHQRYRLKNQNAPKSVKIGQISRDLRKILLEEGLEEENLSKYSKMCVLPDTITGAFSRYKLSTIDWRKVLTGKISIPNYKANLSIPIRCDKPQQRRLERTETGEIEVDLMICNKPYPRVLLSTRTISDGQRSVLERLVLNNANSLPGYRHRIFEIKEKRNEWYLSVTYDFPKAETTKLHSDIIVGVDLGWSVPLYAAINNGYARIGYKQLKPLGDSIKALQRQTIARRRSIQRGGTQDLAAPTARSGHGIKRILQPIEKLEGKIDNAYKTLNHQLSHCVIEFAKNHGAGVIQIENLKGLAEELSGTFIGQNWRYYQLQEFIKYKAKEAGIIVKEVNPFYTSRRCSECGYIHKDFTFEYRQANRKNGKSTMFECPKKEEKGCKPLNADYNAARNLATSDIEDKIRLQCKEQGIEYKEI KEK 210VTLLVKVVKIHLISEQFDKAGNRIDYKEVNKILWELQKQTREAKNKTVQLLW CasM.277378EWNNFSSDYVKASGIYPKAKDIFGYSSVHGQANKELRTKLILNSSNLSTTTMDVCKIFNTYKKEVWEGKRSVPSYKSDQPLDLHNDSIKLIYENKEFYVRLGLLNRAGFAKYGFKDGFRFKMQVKDNSTKTILERCFDGIYTIVASKLLYDQKKNRWKLNLSYSFDNKNISELDKEKILGVDVGVSCPLVASVFGDRDRFIIKGGEIEKFRKSVEARRRSMLEQTKYCGDGRIGHGRKKRTEPALNIGDKIARFRDTTNHKYSRALIEYAVKKGCGTIQMEKLTGITSKADRFLKDWTYYDLQTKIENKAKEVGINVVYIAPKYTSQRCSKCGYIHKDNRPNQAKFRCLKCDFESNADYNASQNIGIKNIDKTIKKERKKQKSEAQVNEK 211MAGKKKDKDVINKTLSVRIIRPRYSDDIEKEISDEKAKRKQDGKTGELDRAFF CasM.280852SELKSRNPDIITNDELFPLFTEIQKNLTEIYNKSISLLYMKLIVEEEGGSTASALSAGPYKECKARFNSYISLGLRQKIQSNFRRKELKGFQVSLPTAKSDRFPIPFCHQVENGKGGFKVYETGDDFIFEVPLIKYTATNKKSTSGKNYTKVQLNNPPVPMNVPLLLSTMRRRQTKKGMQWNKDEGTNAELRRVMSGEYKVSYAEIIRRTRFGKHDDWFVNFSIKFKNKTDELNQNVRGGIDIGVSNPLVCAVTNGLDRYIVANNDIMAFNERAMARRRTLLRKNRFKRSGHGAKNKLEPITVLTEKNERFRKSILQRWAREVAEFFKRTSASVVNMEDLSGITEREDFFSTKLRTTWNYRLMQTTIENKLKEYGIAVNYISPKYTSQTCHSCGKRNDYFTFSYRSENNYPPFECKECNKVKC NADFNAAKNIALKVVL212 MPDTDKGKRLTKVATFQIVKPVNMSWNEFGKMLHDVRYRYWRLANMAVC CasM.281050ENYMRFYRWRTQQTDTNDHYKVKIINGILRKMLIEEKNADEKELSRYSRDGAVSGYVYGAFTQTKLSAITSKSKWGEVIKGKSALPLFKRDTSIPIMCTDKKPSMIEKTASGEYEVDLRICLKDKQLRPNGYPSVLLSTTKISDGQKAVLERLVSNKTNSLPGYRHRFFEVKEKRGDWYLSVSYDFPQAEATRLHPDIIVGVDLGWSVPLYAAINNGYARIGWRKLEPLAKSIKHLQKQTIVRRRSFQKGGKKDLAASTARTGHGIKRILQPIEKLEGKIDNAYKTLNHQLSHCIIEFAKNHGAGVIQIENLKGLAEELSGTFIGQNWRYHQLQEFIKYKAEEAGIAVKEVNPRYTSRRCSKCGYIHIGFDREYRDKNRKNGKSTMFECPECSKRIKDYKPLNADYNAAKNLATADIEEKIRL QCKEQGIEYKELPKD 213MPTITRKIKLELCTKGLSEEERKAQWNLLYHINDNLYRSANNISSKLYLDEHV 
CasM.285333SSLVWLKHKEHQTLKADLAKAKKQKIQDEKTIAELESRLKSCESEMSDQELAICKYTDEMSSKTLSYKFATELELNIYAQILTQVQSKVYADFQNDQKDVRDGKRAIRTYKKGMPIPFPWRNNIRMEPVKKGREYEFYIKWYNDIRFQLIFGKDRSNNRLILQRCFKLDENCVEDYQMRTSSIKMVKGANGTELFLYLVVDIPQEKHILNNKIVVGVDLGINVPAYVATNVTDDRKAIGDREHFLNTRMAISKRFHSFQRLKGTTGGRGKTKKLEPLERLKEKERNWVHTQNHLFSRDVITFALHVKAATIQMEDLSGYGKDDEGNVVEEKKFLLGKWSYYELQEMIKYKAKKVGMRVNFIKPAYSSQTCSWCGERGERNSTSFVCTNSECSHYGEDLHADYNAARNIARSKNIIRYE 214MIITRKIQILFAAQGEEFKKDKDTLYKWSNIVHHASNIVASNKYVCDHLQGM CasM.286285VYLTEEGKEAVSELSQKVDDIFNTSRMNTTYRMISSLYKGEIPTDILSCVNMQVSKLYNKERKKMADGDRSLRSYRSNIPIPFSANSLMRKWKYADKEYSFDLFGIPFKVVLGKDKSNNRSILERLMDGTYKAATSSIKIQNCEDETGKKTRKFFLLLCVEIPDKSYAGREDNILFAELSIDHPLLVSFPIKKEESKPIPIGNKQSYLYKRLQIQKGLDSCKASCKWNKGGRGRKRKMKSTERFKAKEHNFVDAYMHQISAALIKFAIKHDIGKLCLVDVDKKIKEAKESPFVLRNWSYYSLLTKIQYKAKMNGITVV MVDKNVL 215MPTITRKIRLHLCTDGLSEEERKAQWKMLYRINDNLYRAANNISSKLYLDEHI CasM.286678SSMVRLKHAEYTSLKTELLKAKKADDEETVAELEARINVLNAELSAQEEAICSYATEMATRTLAGKFASELDLNIYGQILAEVKSVVFKNFNSDSKDVREGKRSIRTYKKGMPIPFPWNKTIRLEAVKKESSSKHDEDEYEVYLNWYKSSRTEKKAIRFRLDFGKDKSNNQQIVKRCLNLDNTSSESYQLQTSSIQMKKGSEGAELYLLLVVNIPQDQHVLNKKIVVGVDLGINVPAYVATNCTEERKSIGDREHFLNARIAFHRRFHSFQKLKGTTGGRGRKKKLEPLERLREKERNWVHTQNHLISRDVINFALQTKAATIQMEDLSGYGKDEEGNVKPENKFLQSRWSYFELQSMIKYKAAKCGIKVNLINPSYTSQTCSWCGQMGVRESTSFVCQNPECKKYGKDIHADYNAARNI ARSNKTVKNE 216MPTITRKIELRLCTEGLSDEERKAQWMLLYHINDNLYRSANNISSKLYLDEHV CasM.287128SSMVRLKHAEYQSTAAELLKAKKNNADEGTISTLEDKVETLKTEMSAQGIAICNYATEMATRTLAGKFASELELNIYGQILAEVKNVVHTNFTNDAKEVREGKRSIRTYKKGMPIPFPWNKSIKIEPVKASSQNEGQDDYEFYLKWYNGLKFILHFGKDRSNNRQILKRCFGLDNLCNERYQMRTSSIQMKKGSNGMELYLLLVLSIPKEQHSLNKKVVVGVDLGINVPAYVATNCTEERRAIGDREQFLNTRMAIIRRKHSFQRLKGTAGGRGRKKKLDPLERLRETERNWVHTQNHLYSRDIIKFALETKAATIQMEKLKGFGRDDNGNVIEEKKFLLGKWSYYELQNMIKYKAGKVGIKVNFIAPAYTSQTCSCCGVRDDRNRKSTSFICHNPDCQMYGKEIHADYNAARNIAR SKNVIKDE 217MPAITRKIELTLCTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDEHV 
CasM.287826SSMVRMKHADYLSLLKELARAEKQKTPDDELIAELREKLSLAEQEMTDQELAICNYATEMATSTLAYNFAKEIETEIFGQILACLENNAHAVFVDDSPTVRRGERAIRNYKKGMPIPFPWNKSIRIVEKDGEFYLRWYNGMRFLLTFGKDRSNNRIIMKRCLKMDQDFEGEYKLCNSSIQMVKREGKTKLFLLIVVNIPQEHVELNKNIVVGVDLGVNVPAYVATNITEERKAIGDREHFLNTRMQFQRRYKSLQRLKGTAGGKGRTKKLEPLERLRKAEHNWVHTQNHLFSREVVNFAVQTRAATIHMEDLSGFGKDNDGNADEQKEFVLRNWSFYELQNMIAYKAAKYGIKVEKVKPAYTSKTCSWCGQLGFRQGVTFICENPACKQCGEKVHADYNAARNIANSKDIIKKNE 218MAGQRHTKVAKFQILKPAADMRWSELGRLLRDAQYRVYRLANLALSEKYL CasM.287896RFHLFRTGQTESLPECRIGRLNRQLRQMLKDEGGADDSVLDRFSRTGALPDTVVGALWQYRLHALTKGEKWNKVTRGETALPTFRRSMALPIRCDKRIHHRLERAALDSVELDLMICTRPYPRVILKTAKLDDGAAAILERLLDNEGQLLEGYRQRCFEVRYAEDEKAWWLHVTYDSPATPAPHLSKDIIVGVDLGFSCPMYVALSNGDARLGRRQFAALAARIRSLQTQVMARRRQMLSGGKASLSGDTARSGHGRKRKLLPIESLEGRINRAYTTLNHQLSISVVHFAVHHGAGVIQIENLEGLQNELTGTFLGQRWRYHQLQEFLNYKANEAGIEVRRVNPRYTSRRCSKCGYIHVDFNRAFRDAARQEGKVARFCCPKCEYEAHPDYNAARNLATVDIEGIIKVQCERQGIDR PSVENQDEVAK 219MPTITRKIELTLCTEGLSDQERKDQWNLLYHINDNLYRAANNISSKLYLDDHV CasM.287936GSMVRLKHAEYLSLLRALEKAKKQKAPDEEVIAELSQQVATAEQEMDEQAKAICQYATEMSTQTLSYRFATELETNIFGQILTCLRQGVFSTFNSDARDVKRGERSIRTYKKGMPIPFPWNDSLRIGFEDGEFYLRWYNGLRFRFDFGKDRSNNRLIVQRCMKMDKDYEGDYKLCNSSIQMVKREGKPKFFLLLVVNIPQERVELNKNIVVGVDLGINAPAYVATNTTPERKQIGDREHFLNERMAFQRRFKSLQRLKSTTGGRGRAKKLEPLERLRKAEQNWVHTQNHLFSREVIDFAVKARAATIHMEDLSGFGKDRDGNADERKEFVLRNWSYYELQNMITYKAAKYGIKVEKIRPAYTSKTCSWCGHQGFREGITFICENPECKKFGEKEHADYNAARNIANSKEIIKNNEE 220MPTITRKIELSLCTDGLSDEQLKEQRQLLYHINDNLYRAANNVSSKLYLDEHV CasM.288450SSMVRLKHADYLSLLRDLARAEKQKSPDEALISELRSKLAAAQREMTEQELAICRYATEMSTQSLSYRFVTEMETHIFAKILDCLKQGVYATFNSDARDVKRGERSIRNYKKGMPIPFAWSDSVRIEQEADEFYLRWYNGIRFRLVFGKDRSNNRLIVKRCLKLDKDYEGDYKLCNSSVQMVKREGKPKTFLLLVVKIPQEQVELNKKIVLGVDLGINYPVYAATNCTEERIYFGEREHFLNTRMQFQRRYKSLQRLKGTTGGKGRKKKLEPLERLRKAERNWVHTQNHLFSQKTVDFALQTHAATIHLEDLSGFGRDSDGSAEEKKEFVLRNWSYYELQQMITYKAAKYGIKVEKIRPAYTSQTCSWCGQRGFRQGVTFICENPECKKCGEKEQADYNAARNIAKSKDVIKDDDE 221MSIVTRKIELIPDIENLTHEESNQRCYKLLYNIDKKLYKLANLLVCQLFGLDNL 
CasM.288712LSLMRLQNDEYVKFQSKLASKSISKETQKNIKEHMKEIDKELLARKAEIAPKSPLTFAYRAIKGSLYAKDLPSDIFNTLKQDVFKHFNETKKEQLRGERSLATYKRGIPIPFSLMKKNVIVSEGDNYYLTWFEETRFKLNFGKDRSNNRAIIDNCLKTNKYKLCTAAKIQLKNKKLFFLVTVDIPETKNTIIKGKVMGVDLGVVHPAYVAVNDGPERSLIGDGDAFQKQRDVFRRRFKELQRCQLTQGGHGRKHKTKATEILRGKERNWVQTENHRISRKIVNLAIRWKVESIQMENLKGFGKDSEGEVETKHKRLLGRWSYFELQKDIEYKAQKAGIKVVYINPAYTSQTCHVCGKKGDRTERDTFICLNTECSCYGKPQDADMNAAINIARSKNIVK 222MPTITRKIELMLCSEGLSDEQRKEQWGLLYHINDNLYKAANNISSKLYLDEHV CasM.289248SSMVRMKHAEYLSLLKELARAEKQQTPDEGLIAELSRKLSAAEKEMADQELAICKYATEMSTQTLSYNFAKEIETNIFGQILTCLRQGVYATFNSDAKDVKRGERAIRNYKKGMPIPFPWNKSLKIEAEGGDFYLRWYNGLRFLLTFGKDRSNNRMIVKRCMKMDEDFEGEYKLCNSSIQLAKRDGKPKLFLLLVVNIPQEHVELNKKIVVGVDLGVNVPAYVATNITEERKAIGDREHFLNTRMAFQRRYKSLQRLKGTAGGKGRTKKLEPLERLRDAERNWVHTQNHLFSREVVNFAVQARAATIHMEDMSGFGKDKDGNADEKKEFVLRNWSFYELQNMIAYKSAKYGIKVVKIRPAYTSKTCSWCGQQGERKSTTFICENPECKHYGESIHADYNAARNIANSNDIVKENE 223MVITRKIEVFVCESDNDLRRSYYEKLYDIRNIAQEAANRATSMLYAIDNLIPCL CasM.289726DEDSRKLIQYIGAKGTPASRQNAAYTIMSHLYKDRMPGIMDMLSNLAQYVTKNYSEDRKRGMYKNALRSYKCSLPVPYQKKSFKGLRFNWYEDSDGDAHEGCFFSLAGVPLQMRFGRDRSNNRLIVERVISGEYKMCTSSLKFDGKKLFLLLCVDIPKQEANVDPKKTLYAYLGVMNPIICTCDVRAKQEYDSGYKCFEIGTKEEFNYRRRQIQEAVRRCQINNRYSSGGKGRKKKCQAIERWHEKEKNYVDTKLHTYSRMLVDLAVAHKCGTIVLLNQKKREDKAKDDNQNGEPFVLRNWSYYNLKDKIGYKCKLAGIKLVQDKEETEEE 224MVITRKIEVFVCEDSKDLRKEYYDKIYKCRDIAVKTANLGVSHLFMLDNTTP CasM.289802YLSDDDREKLTFLGCSGKKATKQNAPYVAASEKFKGQADMSMLSSVLQNVGKMYQDDKKKGMWSKSLRSYKANMPIPFKASCYRNLRFADYNDKEDKPHNGCFFTLMGIPFQCKFGKDRSGNRIIMQAVVDGKYKMCTSSLQIDGKKIFLLLCVDIPKKVVKLDESKTLYAFLGVMNPIVCTTDIKQKGDIDTDWKLWEIGTEAEFNYRRRQIQEAVKRCQVNNRYSRGGHGRFAKTKAIERWRAVERNYVDTKLHTYSKMLIDLAVKHKCGKIVLMNQLHREDAAKDDKFVLRNWSYHSLRTKIDYKA KMYGIKVEVEK 225MPVITRKIKLNLCTEGLSEDERKAQWKMLYRINDNLYRAANNISSKLYLDEH 
CasM.290380VSSMVRLKNAEYTSLVSDLMKAKKAEDEAAITDLEAKIESLKSEMTAQEEAICCYATEMATRTLAGKFASELDLDIYGQILAEVKSVVFKNFNSDSKEVREGNRSIRTYKKGMPIPFPWNKTIRLEAVKKELSGKHDEDEYDFYLNWYKSSRTDKKAIRFRLYFGKDKSNNQQIVKRCLHLDSTSSENYQMQTSSIQMKKGPEGAELYLLLVVNIPQEQHALNKKIVVGVDLGINVPAYVATNCTEERKAIGDRDHFLNTRMAFSRRFHSFQRLKGTSGGKGRKKKLEPLERLREKERNWVHTQNHLISRDVINFALQVKAATIQMEDLSGYGKDEEGNVKPENKFLQSKWSYFELQSMIKYKAAKCGIKVNLIAPAYTSQTCSWCGQMGIRESTSFVCQNPECKQYGKDIHADYN AARNIARSNKIVKNE 226MRISKTLSLRIVRPFYTPEVEAGIKAEKDKREAQGQTRSLDAKFFNELKKKHS CasM.292901EIILSSEFYSLLSEVQRQLTSIYNHAMSNLYHKIIVEGEKTSTSKALSNIGYDECKAIFPSYMALGLRQKIQSNFRRRDLKNFRMAVPTAKSDKFPIPIYRQVDGSKGGFKISENDGKDFIVELPLVDYVAEEVKTAKGRFTKINISKPPKIKNIPVILSTLRRRQSGQWFSDDGTNAEIRRVISGEYKVSWIEIVRRTRFGKHDDWFVNMVIKYDKPEEGLDSKVVGGIDVGVSSPLVCALNNSLDRYFVKSSDIIAFNKRAMARRRTLLRQNKYKRSGHGSKNKLEPITVLTEKNERFKKSIMQRWAKEVAEFFRGKGASVVRMEELSGLKEKDNFFSSYLRMYWNYGQLQQIIENKLKEYGIKVNYVSPKDTSKKCHSCTHINEFFTFEYRQKNNFPLFKCEKCGVECSADYNAAKNMAI A 227MEEKTKRLQKVAKFQIVKPVNMTWVELGKMLRDVRYRLWRLANMAVCEN CasM.293203YMRFYQWRIGKTDANENHKVKILNRRLREMIIEEKQADAKELMRYSRDGVVSGYICGAFEKIHLSAIKNKSKWREVIRGKSNLPLFKRDLPIPINCSDHKPSLIAKTESDEYEVDLRICQKPYPRVLLSTAKISGGERAILERLVSNKTNSLPGYRHRFFEIKEKPKGRWNLHVTYDFARSEATMLHSDIIVGVDLGWSVPLYAAVNKGHARIGWRKLEPLAKRIRHLQKQVKARRLSVQKGGQRDLAAPTARAGHGRKRILQPIEKLEGKIDDAYKTLNHQLSHCVIEFAKNNGAGVIQVENLEGLKDTLTGTFIGQNWRYNQLQNYIEYKAKEAGMELKKVNPCQTSQRCSNCGFIHRDFTFEYRQANKKNGKAAMFECPECSKKENYKPLNADYNAARNLATAGIEGKIRLQCEK QGIEYKGLPEE 228MSKITRKIEIIPDIEGLTHDESNKKCYGAFYTFDKNLYKVANLLVSQLYGLDN CasM.294190LLSLMRLQNDEYVKCQSKLSLKSTTDAEKENLKKRMKKIDAELVSIKNGMAPKHPQTFAYRAVTNCVYAKNIPSDILNTLKQDVYKHFNDTKKEQFLGERSLTTYKRGMPVPFSIEKKHAIVCDGDNYYLPWFEDTRFRLNFGRDKSNNRAIIDNCIKTKRYKLCAAAKIQLKDKKLFLLVTVDIPATETTSVKGKVMGVDLGVVNPAYVAVNDGPERSRIGNGEAFQKQRDVFRRRFRELQRSQLTQGGHGRKHKTKATETLRGKERNWVQTENHRISREIVNLASRWKVECIQMESLKGYGKNQEGEVEDNHKRLLGRWSYFELQKDIEYKAAMVGIQVKYINPAYTSQTCHVCGQRGNRIERDTFICTNPECTCYNQAQDADMNAAINIAKSKDVVK 229MPTITRKIEMKLCTEGLSDQERKDQWNLLYHINDNLYRAANNISSKLYLDDH 
CasM.294406VLSMVRLKHAEYLGLLRALEKAKKQKIPDEEVIAELSQKVAAAEQEMDDQAKAICQYATEMSTQSLSYRFATELETGIFTKILDCLKQGVFATFNSDTRDVKRGERSIRTYKKGMPIPFAWNDSLRIELEDGEFYLRWYNGLRFRFDFGKDRSNNRLIVRRCLNMDEDYEGDYKLCNSSIQMVKREGLAKFFLLMVVNIPQEQVELNKKIVVGVDLGINAPAYVATNITSERKQIGDREHFLNERMAFQRRFKSLQRLKGTTGGRGRAKKLEPLERLRKAEQNWVHTQNHLFSREVIDFAVKSRAATIHMEDLSGFGKDRDGNADDKKEFVLRNWSYYELQSMITYKAAKYGIKVEKIRPAYTSKTCSWCGHQGFREGITFICENPECKKYGEKEHADYNAARNIANSIEIVKNNEE 230MKDYIRKTLSLRILRPYYGEEIEKEIAAAKKKSQAEGGDGALDNKFWDRLKA CasM.294601EHPEIISSREFYDLLDAIQRETTLYYNRAISKLYHSLIVEREQVSTAKALSAGPYHEFREKFNAYISLGLREKIQSNFRRKELARYQVALPTAKSDTFPIPIYKGFDKNGKGGFKVREIENGDFVIDLPLMAYHRVGGKAGREYIELDRPPAVLNVPVILSTSRRRANKTWFRDEGTDAEIRRVMAGEYKVSWVEILQRKRFGKPYGGWYVNFTIKYQPRDYGLDPKVKGGIDIGLSSPLVCAVTNSLARLTIRDNDLVAFNRKAMARRRTLLRQNRYKRSGHGSANKLKPIEALTEKNELYRKAIMRRWAREAADFFRQHRAATVNMEDLTGIKDREDYFSQMLRCYWNYSQLQTMLENKLKEYGIAVKYIEPKDTSKTCHSCGHVNEYFDFNYRSAHKFPMFKCEKCGVECGADYN AARNIAQA 231MPFKVLKLKIIKPVNMDWNELGQSIRDTRYRVYRLANLAVSEAYLAFHLWR CasM.294655AGKTDAIPKATAGQLNRRLRDMLLEEARTKAVKDRKNTGEKGTEDDAKKAQKEMNKFSKTGALPDTVAGALFMYKVKGLISKGKWTQVIRGKSALPTFRNNMAIPIRCDKKTQRRLERTENGVELELMIRNKPYPRVLLGTQGIGEGAEAIIERLLSNESQAEQGYKQRYFEVREDVNRTWWLYVCYALPASTPPRLDPSKIVGVDLGFTCPMYAAISNGHARLGYRAFSSLAARVKALKLRTMRRRREIQRGGRTIVSGEAARSGHGRKRKLLGIEKLQGRVNQAYTTLNHQMSAAVVKFAIENGAGTIQVENLEGLREELSGTFLGQMWRYFQLQEFLQYKAEENGIVIRKVNPRYTSRRCSQCGHINKEFTRKARDRNAEGGYSAKFKCPDCEYEADADYNAAKNLAVDG IEGIIEKQCGSQGIVL 232MFLYKELKTMAKTNAEEGKIENKEKRLTKVAKFQIVKPVNMTWPEFGKMLG CasM.295201DVRYRLSRVANMAVTEKYLESQQKRTGQKIQRENTLVTIANRKLREMLKKEKVKEEELDRYSRDGAVSGYVTGPFEHNKLSAISKRFKEVLKGNMSLPNFKREMAIPINCSNAKLSTIEKTETGEYVVDLRISQKPWPRVLLSTNRISNGQREILERLAANKTFSDDGYKHLFFEVKQQGKDWFLSVTYSFPKSEAPKLHKDIIVGVDLGWSVPLYAAVNKGYARIGWQKFRPLAERIKHLQKQVKARRITIQKGGQQDLATPTARTGHGRKRILRPIEKLERKIENAYTTLNHQLSHCVIEFAKNNGAGVIQIENLSGLANELSGTYIGQNWRYEQLQEYIRYKAEEAGIEVKHVNPCRTSQRCSECGFINDKFNFEYRQANRNNGMSAMFECPECKKNKKDYKPINADYNAAKNLTTANIDEIIRLQCKKQGIEYKELPKD 233MSKITRKIELIPDIENLTHEESNQRCYKVFYNIDNKLYKVANLLVCQLFGLDNL 
CasM.296640LSLMRLQNDEYVKCQSKLASKSISEETKRDIKKRMEAIDKELLARKDEIAPKHPQTFAYRAIKDSDYAKDLPSDIFNTLKQDVFKHFNETKKEQLRGERSLTTYKRGIPVPFNLMKKNVIVSDGDNYYLTWFEETRFKLNFGKDRSNNRAIIDNCLKTNKYKLCTAAKIQLKNKKLFLLVTVDIPETKNKIIKGKVMGVDLGVVHPAYVAVNDGPERSLIGDGDAFQKQRDVFRRRFRELQRCQLTQGGHGRKHKTKATENLRGKERNWVQTENHRISREIVNLAIRWRVETIQMENLKGFGKDSDGDVETKHQRLLGRWSYFELQKDIEYKAAMAGIKVVYVNPAYTSQTCHVCGERGDRTERDTFICTNTECDCYGKPQDADMNAAINIARSKNIVK 234MTKVVKLPLICEQSDKDGNPIDYKKIYEILFELQRQTREIKNKSIQYCWEFSNF CasM.296642SSDYYKQNHEYPKEKDILSYTLVGFVNDKFKTGNDLYSGNCSTTVRGACGEFKNSKTDFLKGTKSIINYKGNQPLDLHNKTIRFECIGKDYYAYLKLLNRPAFQRNNFSSSEIKFKVLVYDNSSKTIVERCIDNIYKISASKLIYNEKKKCWVLNLSYSFTNNNVCELDENKILGVDLGIHYPICASVNGERKFFKIDGGEIDHTRRKIEVRKKSLLKQGSSCGEGRIGHGIKTRNKPVYNIEDKIACFRDTANHKYSRALINYAVNNNCGIIQMEKLTGITADSDRFLKNWSYFDLQTKIEYKAKEAGITVVYIDPQYTSQRCSKCGYISKENRKVQAKFCCQKCGYEANADYNASQNIGIKDIDKIIKNTK 235VPITKTISLRILRPYYPPEIEAKIKAEKEKRKENGDTGSLNSSYYRELKKEYPSII CasM.298142INDEFFPLLSEMQRNITSIYNRTISHLYHRLIIKKESISTAKALSEGPYRDFKSTFNSYIALGLRQKVQSNFRKKDLMAFKIALPTAKSDKFPIPIYMQTNFKIKESPDSDFIIELPLVEYIAKETKGKNKMFTKVEILSPPKVKNIPVILSTRRRKESGQWFSDEGTNAEIRRIISGEYKVSWIEIVKRTRFGKHDWFVNMVISFEESQEGLDPDVIGGIDIGVSKPLICAINNSLDRYIVKGDDIIAFNRRALSRRRSLLRRNRLKRSGHGSRNKLEPITVLTEKNERFKKSIMQRWAKEVAEFFKSKRASIVQMEELTGIKEREDFFSKTLRMYWNYGQLQKTVENKLREYGIEVRYASPKDTSRRCHSCGHINDYFTFEFRQQNNFPLFKCMNCGIECSADYNAARNIAIAR 236MNRIYQGRVSKVEIPDGKDEWKKLDDGESALWQHHQLFQDDVNYLLAAFA 
CasM.298248ALVPTSCEDDIWKDYQAAIERSWESYTGRQGIWDRPFENACVIVGCKKDASFKEFRRKLNSLTGSKASEKQKFEALKQLFEPATEAAKKLKKHDEPVEESLKGKAKDLFGSTLVNLCAQKTKVTPRDVIAKQRNRASECTKKVNEGERLKWADVFYFKTDTSAAKWSREDAAKNIIQFLDKLLGEVEEKEKDAKTSDQKKKMADLAERLEKQKKPLAAWCNNSKTDLPTTEPTRKGSGGYDLKAAVLFSLQPDLDGFRDAFLLFNQARLKEEFATTEKGDAAYIARMAGGVARPVFPFFCDVWAGKVNDEKIGQGIWPDFEKQAFSEVFTKIGQFIVRGRKFELRLAIADQIIAKIETQKKSDARLQAVERIAEDLADELPDTAVDENGQKRPYGIRERTLKGWRKVRPAWREALKKTPNLTAEDLIKQKNRMQERQREKYGSASLFDRLAKEPEIWNHDDKEDAVETWADYVENLEEKAHLETERLFAPAHATLSPRFFRWSETNNKEHLEASSPDVPFELKADALDLSKKEKSQIKIHFWSPRLWRDGLRGKKENLDKDEPDQNWMPPVLRAFVKARKWPCDKQSFAGASVRLAPRCKENIQLVFEPELHTEILSAKWKENFPFSPAKNKESESVGLFWPRTKEDKVLWFDKGETRCLGVDLGLTNSAAWQILQATNKDATAKAPRLRHRLNPDSEKAAWFAHSITNGIVRVAGEDCWGWRKFAPDEKAKLRAELKKPAGKRNALCRKFLSLNREIEFETATHSFLPELSGSGGRNPTDDETKEAAEFFSTLKTKGFDITDRQPSWGKNLSFPKQNDELLWGLKRVRAQLFRLNRWSEQLGKERDSKPYQSAIEIIGNLRSDDPLIELATLKSEPKRLKSRIAELAGEYLDCFKTLLPRIADRILPWRRGHWSWKPCDNDWHRMELDASKPRPEALLAGQRGISLPRLNQLKDLRQLAQSLNHLCRRKQIKRNETVPEPFEDCRQAMEDAREDRAKKIAHEVFAIALGVELAPPPPDKQERKQTESLHGVYRCLERGPVNFIALENLGGYNPSAKQGRRENRQLSSWLKGRIHKILGELCEMVGMPIVLVNAEYTSRFSAKDHSPGFRAEEVQTDDSRRSFWQRKAKEEPSGWQNEFLCWLNKVPDGKSLLLPKKGGEFFVPLGEGTSLYHADLNAAYRIALRALAHRDRAELLGQTWIEKKPYLVDVAGVFPDSILRNGCAFKTISSSERLWEKVNGDLAMQRCREINLARFASWKIALPQQIISEALPPDEEDDIPM 237MSEATKTLAYRYRLRLTPAQEDILDRSQEQLRLVWNHLVRSQHKVEHEWRH CasM.298264GRAASIKNELLELSLAKNATGQAIPSARKITEERGVSMEEALRLMRQKFVEKVSAIPLRKKDGSRCLRIARRKMATEYAVTVVNAKFKHYYGLGARMCKVLRDKFQKCSDMWIKGKFRRPRFKRKGESVALQRQVQSNSPFKLKRFSDLSALGGQALKKCEVIIHRPLPDSAEIKQIAVSGRRGQRHLIVMFKAASSDVAKNFPATNRTAGVDPGIKVALTITPLDSPDFGTSDKIEKQPDLARDACFLKRLRRLQRKHDRQRRQNNPECFDEKGRWIKGKRLHNESKNMQRTQSRITAMNTHLAESRRDFYHNAACEILRSFDNVAVGKWRPAQTRQRKPTTPSPKGLGAARRATNRISYDHAISLFISYLKDKAERSVTTKHVQEVSEFGSTRSCPKCGKLTGPVGTEGLAVRDWTCVNCNTTFQRDAASAWQIAKRFKAEVASTSQPAESQDSANSASVLTQV 238MPTLTRKVELYVVGDKEEVSRVYDYIRLAMNATYKCFNECMTALYIAQVKE 
CasM.298446DTKEDRKELNHLYSRQTYTKKETAFTNDIVFPEGLALAAYVNRMAQQKFVTSLKNGLMYGCVSLPTFKKDCAVPLHVKFVSLAGEKGTNTGFYHEYADVNDLVNALEYDNSPKVFLRFPNNITFGVVFGNPYRGREQRSVFSKIFLGEYKIQGSSIQINSRGKIILNLSMEVPKKKMEHIEGRVVGVDVGLAIPAMCAINDDDYTRSAIGNIDDFLKVRTQIQSQRRRLQKSLKNTSSGHGRTKKLKPLERIAEKERNFANTYNHMVSKRVVDFAVKNGASQINIEDLSGFAKDKNGKSVEDDNMKRVLSNWSYFELQQQIRYKAEQYDIKVRTVNPAYTSQTCSYCGQIGKRETQSKFVCTNPDCKCHKMYKKDWFNADFNAARNIALSTDYTDDEDGKKTKKKKSAKKKPEKK TEEA 239MSGASGQITRDNKAQRSGPNKGEMSEDHSSTKRPKRVVKVAKYRIIKPVGEM CasM.298612TWPELGEILRTVRYRVFRLGNLAVSEAYLNFHAFRTGKAEEFKSETIGKLSRRLRDMLISEGVKKEDIDRYSATGAVPDTVAGALGQYKVRGITSPAKWRQVIRGTVSLPTFRNDMAIPVRCDKPAQRRLEKAKSEEVEVDLMICRKPYPRVLIGTADLGGGQQAILERLLDNKDNSSDGYRQRLFEIKQDTQSKKWFLFVTYDFPSSGALPLDPNVAVGVDLGVSVPLYAAINNGHARLGRRQFQALGSRIRSLQTQVDARRRAIQRGGRSDVSQSTARSGHGVRRKLQPTEKLRKRIDRSYSTLNHQLSAAVVEFAKNQGAGTVQMEDLGGLREELTGTFIGARWRYHQLQQFLEYKCDEAGITLNKVNPMYTSRRCSECGFIDKDFDRAFRDRSRSDGRVARFICPECSYEADPDYNAARNIATLDIDKLIRVQCQKQGLKYDAL 240MIITRKIELWLSEDDNELRKAKWSYLKELNDEVYRAANFIVNNQYFNEILENR CasM.299584VIMQDTRLIDIDSEIRKLYKSREKNKEKIDELKKIKKIRYQEAKNFYQTSKQNVTYQLTSREFPNIPANIVTSLNASIIKTLKTEWNEIKSGKRAVRNYRKGMPIPFNFSSSQKWFENKGEDIFLNWLGGLKFKLFFGRDKSNNRAIVERAINKEYKYADSSIQLKDKKIFLLFVVDIPYEKANLNKNIAAGVDLGIAFPAFCALSEGYSRLSIGNKEDLLKVRLQMQSRRKRLQKALKITSGGKGRTKKLKALESLTNKEKNYVTTYNHKVSYQVIKFAKDNKAGIIKLEFLEGFGEDEKNKFILRNWSFYQLQKMIEYKAKREGIEVLYIDPYHTSQTCAICGNYEEGQREKQEDFICKNPECKNFEKIVNADYNAALNIAKSNKIVSSSEQCEYNKKHENNVL 728MPTITRKIELTLCTEGLSDQERKDQWNLLYHINDNLYRAANNISSKLYLDDHV CasM.286251GSMVRLKHAEYLSLLRALEKAKKQKAPDEEVIAELSQQVATAEQEMDEQAK (D267A)AICQYATEMSTQTLSYRFATELETNIFGQILTCLRQGVFSTFNSDARDVKRGERSIRTYKKGMPIPFPWNDSLRIGFEDGEFYLRWYNGLRFRFDFGKDRSNNCLIVQRCMKMDKDYEGDYKLCNSSIQMVKREGKPKFFLLLVVNIPQERVELNKNIVVGVALGINAPAYVATNTTPERKQIGDREHFLNERMAFQRRFKSLQRLKGTTGGRGRAKKLEPLERLRKAEQNWVHTQNHLFSREVIDFAVKARAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMITYKAAKYGIKVEKIRPAYTSKTCSWCGHQGFREGITFICENPECKKFGEKEHADYNAARNIANSKEIIKNNE E 729MPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV 
CasM.19952SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEH (D267A)AICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVALGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNE 730MPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.19952SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEH (D267N)AICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVNLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNE 731MPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHV CasM.19952SSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEH (E363Q)AICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMQDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNE

One technological advantage of CasM.19952 is its ability to create a blunt end cut or nearly blunt end cut, also referred to as a “short stagger” cut. This is demonstrated in Example 24. As a consequence of blunt cutting, perfect repair is less likely than with a Cas nuclease that makes a staggered cut. The substantial overhangs of a staggered cut increase the chances that the cut will “spontaneously” repair, and decrease the chances of successful DNA editing, modification or donor insertion. In some instances, CasM.19952 cleaves double stranded DNA (dsDNA) resulting in two dsDNA ends. In some instances, at least one dsDNA end is a blunt end. A blunt end has no overhanging nucleotides. In some instances, at least one dsDNA end has at least one overhanging nucleotide. In some instances, at least one dsDNA end has less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, or less than 3 overhanging nucleotides. In some instances, at least one dsDNA end does not have more than two overhanging nucleotides. In some instances, neither dsDNA end has more than two overhanging nucleotides. The absence or extent of an overhang can be determined by Sanger cut-site mapping, e.g., using a forward primer to sequence (report on) the target strand and a reverse primer to sequence (report on) the non-target strand.
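As a rough illustration of how cut-site mapping results might be translated into an overhang length, the following Python sketch assumes that the cut position on each strand has already been expressed as an integer index on one shared coordinate system. The function names, the coordinate convention, and the default two-nucleotide cutoff for a “short stagger” cut are hypothetical choices made for this example; the cutoff simply mirrors the "not more than two overhanging nucleotides" language above.

```python
def overhang_length(target_cut: int, non_target_cut: int) -> int:
    """Return the number of overhanging nucleotides left by a double-strand cut.

    Both arguments are cut positions on the same coordinate system
    (an assumption of this sketch). Equal positions mean a blunt end.
    """
    return abs(target_cut - non_target_cut)


def classify_cut(target_cut: int, non_target_cut: int, short_max: int = 2) -> str:
    """Label a cut as blunt, short stagger, or staggered.

    short_max is a hypothetical cutoff: overhangs of 1..short_max
    nucleotides are labeled "short stagger".
    """
    n = overhang_length(target_cut, non_target_cut)
    if n == 0:
        return "blunt"
    if n <= short_max:
        return "short stagger"
    return "staggered"
```

For example, `classify_cut(20, 20)` labels a cut at the same position on both strands as blunt, while `classify_cut(18, 23)` labels a five-nucleotide offset as staggered.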

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45.
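The percent identity thresholds above can be made concrete with a small calculation. The Python sketch below is only one common convention, not necessarily the definition used in this disclosure: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it reports matches divided by the number of alignment columns. The function names and the toy sequences in the usage note are illustrative assumptions.

```python
# Thresholds recited in the paragraph above, in percent.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical residues / alignment columns,
    ignoring columns where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold that identity satisfies, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

With two ten-residue toy sequences that differ at one position, `percent_identity("MPTITRKIEL", "MPTITRKIEV")` gives 90.0, which satisfies the "at least 90%" threshold but not "at least 95%".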

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 1. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 1.
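The "contiguous amino acids" criterion can be checked by finding the longest stretch that a candidate sequence shares exactly with a reference sequence. The Python sketch below uses a standard longest-common-substring dynamic program; the function names are hypothetical, and the word "about" in the recited lengths is not modeled, so the comparison is against an exact minimum length.

```python
def longest_shared_run(candidate: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    # prev[j] = length of the shared run ending at the previous candidate
    # residue and reference position j (1-based).
    prev = [0] * (len(reference) + 1)
    for c in candidate:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if c == r:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best:
                    best = curr[j]
        prev = curr
    return best


def comprises_contiguous(candidate: str, reference: str, min_length: int) -> bool:
    """True if candidate contains at least min_length contiguous residues of reference."""
    return longest_shared_run(candidate, reference) >= min_length
```

In practice `reference` would be the full amino acid sequence of the recited SEQ ID NO and `min_length` one of the recited values such as 200 or 400. The dynamic program runs in time proportional to the product of the two sequence lengths, which is modest for proteins of a few hundred residues.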

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 2. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 2.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 3. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 3.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 4. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 4.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 5. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 5.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 6. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 6.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 7. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 7.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 8. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 8.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 9. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 9.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 10. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 10.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 11. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 11.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 12. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 12.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 13. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 13.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 14. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 14.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 15. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 15.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 16. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 16.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 17. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 17.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 18. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 18.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 19. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 19.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 20. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 20.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 21. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 21.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 22. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 22.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 23. In some instances, the engineered guide nucleic acid comprises a sequence that is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to the following sequence or an equal length portion thereof: TGGGGCAGTTGGTTGCCCTTAGCCTGAGGCATTTATTGCACTCGGGAAGTACCATTTCTCAGAAATGGTACATCCAAC (SEQ ID NO: 186). The equal length portion thereof may be about 40 nucleotides, about 50 nucleotides, about 60 nucleotides, or about 70 nucleotides.
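One way to read "identical to ... an equal length portion thereof" is to compare a candidate guide sequence against every window of the reference that has the same length as the candidate, and to take the best score. The Python sketch below follows that reading under stated assumptions: the comparison is ungapped, position by position, and the function names are hypothetical. The constant holds the SEQ ID NO: 186 sequence as given above; this is an interpretation for illustration, not a statement of how identity is computed in this disclosure.

```python
# Sequence recited above as SEQ ID NO: 186.
SEQ_ID_NO_186 = (
    "TGGGGCAGTTGGTTGCCCTTAGCCTGAGGCATTTATTGCA"
    "CTCGGGAAGTACCATTTCTCAGAAATGGTACATCCAAC"
)


def ungapped_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_window_identity(query: str, reference: str) -> float:
    """Best ungapped identity of query against any equal-length window of reference."""
    n = len(query)
    if n == 0 or n > len(reference):
        raise ValueError("query must be non-empty and no longer than reference")
    return max(
        ungapped_identity(query, reference[i:i + n])
        for i in range(len(reference) - n + 1)
    )
```

A 40-nucleotide slice taken directly from the reference, such as `SEQ_ID_NO_186[10:50]`, scores 100.0 against the reference, while a candidate with substitutions scores lower and can be compared against the recited thresholds.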

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 24. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 24.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 25.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 26. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 26.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 27. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 27.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 28.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 29. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 29.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 30. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 30.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 31.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 32. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 32.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 33. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 33.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 34.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 35. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 35.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 36. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 36.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 37. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 37.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 38. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 38.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 39. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 39.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 40. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 40.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 41. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 41.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 42. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 42.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 43. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 43.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 44. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 44.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 45. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 45.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 202. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 202.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 203. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 203.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 204. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 204.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 205. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 205.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 206. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 206.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 207. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 207.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 208. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 208.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 209. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 209.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 210. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 210.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 211. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 211.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 212. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 212.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 213. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 213.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 214. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 214.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 215. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 215.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 216. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 216.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 217. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 217.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 218. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 218.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 219. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 219.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 220. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 220.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 221. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 221.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 222. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 222.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 223. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 223.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 224. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 224.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 225. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 225.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 226. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 226.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 227. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 227.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 228. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 228.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 229. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 229.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 230. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 230.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 231. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 231.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 232. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 232.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 233. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 233.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 234. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 234.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 235. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 235.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 236. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 236.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 237. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 237.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 238. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 238.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 239. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 239.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 240. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 240.
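
By way of non-limiting illustration, the two criteria recited above (percent identity to a reference sequence, and a minimum number of contiguous reference amino acids) may be evaluated computationally. The following is a minimal sketch only. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes one common convention for percent identity (identical columns divided by alignment columns); other conventions exist. The function names and the toy sequences are hypothetical and are not sequences of the present disclosure.

```python
def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: identity = identical columns / alignment columns,
    ignoring columns in which both sequences carry a gap ('-').
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(q, r) for q, r in zip(aligned_query, aligned_ref)
               if not (q == "-" and r == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for q, r in columns if q == r and q != "-")
    return 100.0 * matches / len(columns)


def longest_shared_run(query: str, ref: str) -> int:
    """Length of the longest contiguous stretch of `ref` found in `query`."""
    best = 0
    prev = [0] * (len(ref) + 1)
    for q in query:
        curr = [0] * (len(ref) + 1)
        for j, r in enumerate(ref, start=1):
            if q == r:
                # Extend the run that ended at the previous residue pair.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Toy (hypothetical) sequences differing at one position.
ref = "MKTAYIAKQR"
query = "MKTAYLAKQR"
identity = percent_identity(query, ref)
run = longest_shared_run(query, ref)
```

In this sketch, a candidate would satisfy an "at least 90% identical" criterion when `percent_identity` returns 90.0 or more, and an "at least about 200 contiguous amino acids" criterion when `longest_shared_run` returns about 200 or more.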

In some cases, the D2S effector proteins comprise a RuvC domain (e.g., a partial RuvC domain). In some instances, the RuvC domain may be defined by a single, contiguous sequence, or a set of partial RuvC domains that are not contiguous with respect to the primary amino acid sequence of the protein. A D2S effector protein of the present disclosure may include multiple partial RuvC domains, which may combine to generate a RuvC domain with substrate binding or catalytic activity. For example, a D2S effector protein may include 3 partial RuvC domains (RuvC-I, RuvC-II, and RuvC-III, also referred to herein as subdomains) that are not contiguous with respect to the primary amino acid sequence of the D2S effector protein, but form a RuvC domain once the protein is produced and folds. In some instances, a partial RuvC domain is a RuvC subdomain. In many cases, D2S effector proteins comprise a recognition domain (e.g., a REC domain) with a binding affinity for a guide nucleic acid or for a guide nucleic acid-target nucleic acid heteroduplex. An effector protein may comprise a zinc finger domain.

In certain instances, the amino acid sequence of the D2S effector protein comprises an amino acid alteration. In certain instances, the amino acid sequence of the D2S effector protein comprises one or more amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises two, three, four, five, six, seven, eight, nine, ten, or more amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises two, three, four, five, six, seven, eight, nine, or ten amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least two amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least three amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least four amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least five amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least six amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least seven amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least eight amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least nine amino acid alterations. In certain instances, the amino acid sequence of the D2S effector protein comprises at least ten amino acid alterations. In some instances, the amino acid sequence of the D2S effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, wherein the amino acid sequence of the D2S effector protein comprises one or more amino acid alterations relative to SEQ ID NO: 23.

In some embodiments, the D2S protein comprises one or more amino acid alterations at positions 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 261, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 457, 458, 459, 460, 461, 462, 463, 464, 466, 467, or 468, or any combination thereof, of SEQ ID NO: 23 when the sequence of the D2S protein and SEQ ID NO: 23 are aligned for maximum alignment.

In some embodiments, the D2S protein comprises one or more amino acid alterations at a position corresponding to residue A110, T111, E112, M113, S114, T115, Q116, S117, L118, S119, F122, A123, T124, E125, L126, E127, T128, N129, I130, F131, A132, K261, V263, V264, G265, V266, D267, L268, G269, I270, N271, V272, P273, A274, Y275, V276, A277, T278, N279, I280, T281, E282, E363, I457, A458, N459, S460, K461, D462, I463, I464, K466, N467, or E468, or any combination thereof, of SEQ ID NO: 23. In some cases, these amino acid alterations could be applied to CasM.19952 or proteins homologous to CasM.19952 (protein homologs), wherein the protein homologs have the same amino acid as CasM.19952 before the amino acid is altered at that position when CasM.19952 and the protein homolog are aligned for maximal alignment.
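
As a non-limiting illustration of how a position in a reference sequence may be carried over to a homolog, the following minimal sketch walks a pairwise alignment and reports the homolog position and residue that correspond to a given reference position. It assumes the alignment has already been computed by any suitable tool and is supplied as two equal-length strings with "-" marking gaps. The function name `map_position` and the toy sequences are hypothetical and do not represent CasM.19952 or any sequence of the present disclosure.

```python
from typing import Optional, Tuple


def map_position(aligned_ref: str, aligned_homolog: str,
                 ref_pos: int) -> Tuple[Optional[int], str, str]:
    """Map a 1-based reference position onto an aligned homolog.

    Returns (homolog_position, reference_residue, homolog_residue).
    homolog_position is None when the homolog has a gap at that column.
    """
    if len(aligned_ref) != len(aligned_homolog):
        raise ValueError("sequences must be pre-aligned to equal length")
    ref_i = 0  # residues of the reference seen so far
    hom_i = 0  # residues of the homolog seen so far
    for r, h in zip(aligned_ref, aligned_homolog):
        if r != "-":
            ref_i += 1
        if h != "-":
            hom_i += 1
        if r != "-" and ref_i == ref_pos:
            return (hom_i if h != "-" else None, r, h)
    raise IndexError("ref_pos lies outside the reference sequence")


# Toy (hypothetical) alignment: the homolog has one insertion and one deletion.
aligned_ref = "MK-TAYI"
aligned_hom = "MKGTA-I"

hom_pos, ref_res, hom_res = map_position(aligned_ref, aligned_hom, 3)
same_residue = ref_res == hom_res  # the homolog carries the same amino acid here
```

In this sketch, an alteration would be transferred to the homolog only when `same_residue` is true, mirroring the condition that the homolog has the same amino acid as the reference at the aligned position before alteration.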

In some embodiments, the one or more amino acid alteration can be an insertion, deletion, or substitution. In some embodiments, the one or more amino acid alteration can be a substitution. In some embodiments, the one or more amino acid alteration can be a conservative or non-conservative amino acid substitution. In some instances, the D2S effector protein comprises an arginine substitution. In some instances, the alteration corresponds to an alteration shown in TABLE 9, Example 18, or Example 19. In some instances, the one or more amino acid alteration is A110R, T111R, E112R, M113R, S114R, T115R, Q116R, S117R, L118R, S119R, F122R, A123R, T124R, E125R, L126R, E127R, T128R, N129R, I130R, F131R, A132R, K261R, V263R, V264R, G265R, V266R, D267R, D267A, D267N, L268R, G269R, I270R, N271R, V272R, P273R, A274R, Y275R, V276R, A277R, T278R, N279R, I280R, T281R, E282R, E363Q, I457R, A458R, N459R, S460R, K461R, D462R, I463R, I464R, K466R, N467R, or E468R of SEQ ID NO: 23. In some instances, the D2S protein comprises the amino acid alteration T115R, T124R, L126R, E127R, T128R, N129R, A132R, K261R, V263R, T278R, T281R, E282R, N459R, S460R, D462R, K466R, N467R, or E468R of SEQ ID NO: 23. In some instances, the one or more amino acid alteration is T124R, T128R, N129R, T278R, E282R, T281R, or any combination thereof of SEQ ID NO: 23.
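
The substitution notation used above (for example, T124R) names the original residue, its 1-based position, and the replacement residue. As a non-limiting illustration, the following minimal sketch parses that notation and applies it to a sequence, checking first that the sequence actually carries the named original residue at the named position. The function name `apply_substitution` and the toy sequence are hypothetical; SEQ ID NO: 23 is not reproduced here.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, alteration: str) -> str:
    """Apply a substitution written as e.g. 'T124R' to `sequence`."""
    match = _SUBSTITUTION.match(alteration)
    if match is None:
        raise ValueError(f"unrecognized alteration: {alteration!r}")
    original, position, replacement = (
        match.group(1), int(match.group(2)), match.group(3))
    if position < 1 or position > len(sequence):
        raise IndexError(f"position {position} is outside the sequence")
    if sequence[position - 1] != original:
        raise ValueError(
            f"expected {original} at position {position}, "
            f"found {sequence[position - 1]}")
    return sequence[:position - 1] + replacement + sequence[position:]


# Toy (hypothetical) sequence with two arginine substitutions applied in turn.
toy = "MKTAYI"
variant = apply_substitution(apply_substitution(toy, "T3R"), "A4R")
```

Because each substitution replaces one residue with one residue, positions do not shift, so a combination of substitutions can be applied in any order in this sketch.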

When a conservative substitution is described herein, such a substitution refers to the replacement of one amino acid for another such that the replacement takes place within a family of amino acids that are related in their side chains. Alternatively, a non-conservative substitution, when described herein, refers to the replacement of one amino acid residue for another such that the replacement residue belongs to a different family of amino acids than the replaced residue. Genetically encoded amino acids can be divided into four families: (1) acidic (negatively charged)=Asp (D), Glu (E); (2) basic (positively charged)=Lys (K), Arg (R), His (H); (3) non-polar (hydrophobic)=Cys (C), Ala (A), Val (V), Leu (L), Ile (I), Pro (P), Phe (F), Met (M), Trp (W), Gly (G), Tyr (Y), with non-polar also being subdivided into: (i) strongly hydrophobic=Ala (A), Val (V), Leu (L), Ile (I), Met (M), Phe (F); and (ii) moderately hydrophobic=Gly (G), Pro (P), Cys (C), Tyr (Y), Trp (W); and (4) uncharged polar=Asn (N), Gln (Q), Ser (S), Thr (T). In alternative fashion, the amino acid repertoire can be grouped as (1) acidic (negatively charged)=Asp (D), Glu (E); (2) basic (positively charged)=Lys (K), Arg (R), His (H); (3) aliphatic=Gly (G), Ala (A), Val (V), Leu (L), Ile (I), Ser (S), Thr (T), with Ser (S) and Thr (T) optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic=Phe (F), Tyr (Y), Trp (W); (5) amide=Asn (N), Gln (Q); and (6) sulfur-containing=Cys (C) and Met (M) (see, for example, Biochemistry, 4th ed., Ed. by L. Stryer, WH Freeman and Co., 1995, which is incorporated by reference herein in its entirety).
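
As a non-limiting illustration of the four-family grouping described above, the following minimal sketch classifies a substitution as conservative when the original and replacement residues fall within the same family, and as non-conservative otherwise. It uses the standard one-letter codes (Glu = E, Gln = Q). The names `FAMILIES`, `family_of`, and `is_conservative` are hypothetical and are introduced only for this example.

```python
# Four-family grouping of the genetically encoded amino acids.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "non-polar": set("CAVLIPFMWGY"),
    "uncharged polar": set("NQST"),
}


def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"not a genetically encoded amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same family."""
    return family_of(original) == family_of(replacement)


# Examples drawn from substitution types named in this disclosure.
k_to_r = is_conservative("K", "R")  # both basic
t_to_r = is_conservative("T", "R")  # uncharged polar to basic
d_to_n = is_conservative("D", "N")  # acidic to uncharged polar
```

Under this grouping, a lysine-to-arginine substitution is conservative, whereas a threonine-to-arginine or aspartate-to-asparagine substitution is non-conservative. Applying the alternative six-group scheme would only require replacing the `FAMILIES` table.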

In some instances, the amino acid sequence of the D2S effector protein, other than the one or more amino acid alteration as described herein, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 241-293. In some instances, the amino acid sequence of the D2S effector protein, other than the one or more amino acid alteration corresponding to the alteration shown in TABLE 9, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 241-293.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 110, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 241. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 110, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 241.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 111, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 242. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 111, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 242.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 112, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 243. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 112, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 243.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 113, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 244. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 113, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 244.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 114, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 245. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 114, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 245.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 115, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 246. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 115, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 246.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 116, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 247. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 116, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 247.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 117, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 248. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 117, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 248.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 118, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 249. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 118, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 249.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 119, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 250. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 119, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 250.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 122, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 251. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 122, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 251.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 123, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 252. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 123, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 252.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 124, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 253. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 124, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 253.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 125, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 254. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 125, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 254.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 126, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 255. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 126, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 255.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 127, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 256. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 127, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 256.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 128, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 257. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 128, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 257.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 129, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 258. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 129, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 258.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 130, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 259. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 130, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 259.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 131, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 260. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 131, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 260.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 132, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 261. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 132, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 261.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 261, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 262. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 261, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 262.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 263, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 263. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 263, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 263.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 264, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 264. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 264, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 264.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 265, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 265. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 265, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 265.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 266, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 266. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 266, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 266.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 267, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 267. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 267, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 267.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 268, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 268. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 268, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 268.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 269, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 269. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 269, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 269.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 270, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 270. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 270, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 270.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 271, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 271. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 271, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 271.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 272, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 272. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 272, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 272.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 273, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 273. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 273, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 273.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 274, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 274. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 274, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 274.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 275, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 275. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 275, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 275.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 276, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 276. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 276, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 276.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 277, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 277. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 277, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 277.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 278, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 278. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 278, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 278.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 279, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 279. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 279, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 279.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 280, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 280. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 280, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 280.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 281, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 281. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 281, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 281.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 282, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 282. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 282, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 282.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 457, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 283. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 457, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 283.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 458, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 284. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 458, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 284.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 459, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 285. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 459, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 285.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 460, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 286. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 460, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 286.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 461, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 287. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 461, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 287.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 462, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 288. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 462, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 288.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 463, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 289. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 463, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 289.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 464, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 290. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 464, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 290.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 466, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 291. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 466, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 291.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 467, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 292. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 467, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 292.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 293. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 293.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 728. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 728.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 729. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 729.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 730. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 730.

In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 731. In certain instances, compositions comprise an effector protein and an engineered guide nucleic acid, wherein the amino acid sequence of the effector protein, other than the amino acid alteration at position 468, comprises at least about 200, at least about 220, at least about 240, at least about 260, at least about 280, at least about 300, at least about 320, at least about 340, at least about 360, at least about 380, or at least about 400 contiguous amino acids of SEQ ID NO: 731.

In some embodiments, effector proteins provided herein are a variant of a reference polypeptide, wherein the reference polypeptide has an amino acid sequence of SEQ ID NO: 23, and the effector protein comprises one or more amino acid alterations. In some embodiments, effector proteins provided herein are a variant of a reference polypeptide, wherein the reference polypeptide has an amino acid sequence of SEQ ID NO: 23, and the effector protein comprises one or more conservative or non-conservative amino acid alterations. In some embodiments, effector proteins provided herein are a variant of a reference polypeptide, wherein the reference polypeptide has an amino acid sequence of SEQ ID NO: 23, and the effector protein comprises one or more amino acid alterations comprising substitutions, deletions, insertions, or any combination thereof. In some embodiments, effector proteins provided herein are a variant of a reference polypeptide, wherein the reference polypeptide has an amino acid sequence of SEQ ID NO: 23, and the effector protein comprises one or more amino acid alterations that are conservative amino acid alterations. In some embodiments, effector proteins provided herein are a variant of a reference polypeptide, wherein the reference polypeptide has an amino acid sequence of SEQ ID NO: 23, and the effector protein comprises one or more amino acid alterations that are non-conservative amino acid alterations.

In some instances, an effector protein disclosed herein comprises an amino acid sequence that is at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 23 and comprises at least one amino acid alteration relative to SEQ ID NO: 23. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 23 and comprises at least one conservative amino acid alteration relative to SEQ ID NO: 23. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 23 and comprises at least one non-conservative amino acid alteration relative to SEQ ID NO: 23. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 23, wherein all but 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid alterations relative to SEQ ID NO: 23 are conservative amino acid substitutions. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% identical to SEQ ID NO: 23, wherein all but 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 amino acid alterations relative to SEQ ID NO: 23 are non-conservative amino acid substitutions. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is identical to SEQ ID NO: 23 with the exception of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 conservative amino acid alterations. In some instances, an effector protein disclosed herein comprises an amino acid sequence that is identical to SEQ ID NO: 23 with the exception of 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 non-conservative amino acid alterations.

In some embodiments, the D2S effector protein comprises one or more amino acid alterations in a domain of the D2S effector protein, wherein the D2S effector protein comprises a RuvC domain, a REC domain, or a zinc finger domain, or any combination thereof. In certain embodiments, the RuvC domain comprises RuvC-I, RuvC-II, or RuvC-III subdomains, or any combination thereof. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in a RuvC subdomain or the REC domain. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in the RuvC-I subdomain, the RuvC-II subdomain, or the REC domain. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in the RuvC-I subdomain. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in the RuvC-II subdomain. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in the REC domain. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in a domain of SEQ ID NO: 23. In certain embodiments, the D2S effector protein comprises one or more amino acid alterations in the RuvC-I subdomain, the RuvC-II subdomain, or the REC domain of SEQ ID NO: 23.

In some embodiments, the D2S effector protein comprises one or more amino acid alterations at a position corresponding to the residue at 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, or any combination thereof in the REC domain of SEQ ID NO: 23. In some embodiments, the D2S effector protein comprises one or more amino acid alterations at a position corresponding to the residue at 261, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, or any combination thereof in the RuvC-I domain of SEQ ID NO: 23. In some embodiments, the D2S effector protein comprises one or more amino acid alterations at a position corresponding to the residue at 457, 458, 459, 460, 461, 462, 463, 464, 466, 467, 468, or any combination thereof in the RuvC-II domain of SEQ ID NO: 23. In some embodiments, the amino acid alteration is an arginine substitution.

In some embodiments, the D2S effector protein comprises one or more of the amino acid alterations T115R, T124R, L126R, E127R, T128R, N129R, A132R, or any combination thereof in a REC domain of SEQ ID NO: 23. In some embodiments, the D2S effector protein comprises one or more of the amino acid alterations K261R, V263R, T278R, T281R, E282R, or any combination thereof in a RuvC-I domain of SEQ ID NO: 23. In some embodiments, the D2S effector protein comprises one or more of the amino acid alterations N459R, S460R, D462R, K466R, N467R, E468R, or any combination thereof in a RuvC-II domain of SEQ ID NO: 23. In some embodiments, the D2S effector protein comprises one or more amino acid alterations corresponding to D267A, E363Q, or any combination thereof. In some embodiments, the D2S effector protein comprises one or more amino acid alterations corresponding to D267N, E363Q, or any combination thereof.
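
By way of non-limiting illustration, the substitution labels used above (e.g., T115R: original residue, 1-based position, replacement residue) can be parsed and applied to a sequence programmatically. The following is a minimal sketch only. The function names `parse_substitution` and `apply_substitutions` and the short sequence in the usage note are hypothetical and are not taken from SEQ ID NO: 23 or any other sequence of the disclosure.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'T115R' into ('T', 115, 'R')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence, labels):
    """Return a copy of `sequence` with every labelled substitution applied.

    Each label is checked against the unmodified input sequence, so a label
    whose original residue does not match the template is rejected.
    """
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if sequence[position - 1] != original:
            raise ValueError(
                f"{label}: template has {sequence[position - 1]} at {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)
```

For instance, applying the hypothetical labels `["T3R", "K8R"]` to the toy template `"MKTAYIAK"` replaces the threonine at position 3 and the lysine at position 8 with arginine. A label that names the wrong original residue raises an error rather than silently altering the sequence.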

In some embodiments, to provide a D2S effector protein variant, a D2S effector protein disclosed herein is selected as a template or parent sequence. Variants can be created by introducing one or more amino acid alterations (e.g., a substitution) into the template or parent sequence. The variants can be screened to identify those that have increased activity and/or specificity for their substrates. For example, D2S effector protein variants are screened to identify those alterations leading to increased activity or specificity for the parent D2S effector protein's substrate or substrates.

For the purpose of amino acid position numbering, in some embodiments, SEQ ID NO: 23 is used as the reference sequence. Therefore, for example, when amino acid position 278 is mentioned in reference to SEQ ID NO: 23 but in the context of a variant sequence, the corresponding amino acid position for variant creation may have the same or a different position number (e.g., 277, 278, or 279). In some cases, the original amino acid and its position on the SEQ ID NO: 23 reference sequence will precisely correlate with the amino acid and position on the variant sequence. In other cases, the original amino acid and its position on the SEQ ID NO: 23 reference sequence will correlate with the original amino acid, but its position on the variant will not be in the corresponding template position. However, the corresponding amino acid on the variant can be a predetermined distance from the position on the template, such as within 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 amino acid positions from the reference template position. In other cases, the original amino acid on the SEQ ID NO: 23 reference sequence will not precisely correlate with the amino acid on the variant. However, one can understand what the corresponding amino acid on the variant sequence is based on the general location of the amino acid on the template and the sequence of amino acids in the vicinity of the variant amino acid.
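
As a non-limiting illustration of how a reference position may shift in a variant, the sketch below maps a 1-based position on a reference sequence to the corresponding position on a variant, given a gapped pairwise alignment of the two. It assumes the alignment has already been produced by some alignment tool and is supplied as two equal-length strings in which `-` marks a gap. The function name `map_reference_position` and the toy sequences are hypothetical.

```python
def map_reference_position(aligned_reference, aligned_variant, reference_position):
    """Map a 1-based reference position to the aligned 1-based variant position.

    Returns None when the reference residue is aligned to a gap, i.e. the
    residue has no counterpart in the variant.
    """
    if len(aligned_reference) != len(aligned_variant):
        raise ValueError("aligned sequences must have the same length")
    reference_index = 0  # residues of the reference seen so far
    variant_index = 0    # residues of the variant seen so far
    for reference_char, variant_char in zip(aligned_reference, aligned_variant):
        if reference_char != "-":
            reference_index += 1
        if variant_char != "-":
            variant_index += 1
        if reference_char != "-" and reference_index == reference_position:
            return variant_index if variant_char != "-" else None
    raise ValueError("position lies beyond the end of the reference")
```

With the toy alignment `"MKT-AYIAK"` (reference) and `"MKTGAYIAK"` (variant, carrying a one-residue insertion), reference position 4 corresponds to variant position 5, whereas positions before the insertion keep their numbering. This mirrors the case described above in which the corresponding amino acid sits a small, known distance from the template position.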

In certain instances, a variant D2S effector protein has an increased nuclease activity as compared to the nuclease activity of the corresponding parent sequence of SEQ ID NO: 23. In some embodiments, a variant D2S effector protein has a nuclease activity that is at least 0.25 fold, at least 0.5 fold, at least 0.75 fold, at least 1 fold, at least 1.25 fold, at least 1.5 fold, at least 2 fold, at least 5 fold, at least 10 fold, at least 25 fold, or 0.25 to 25 fold as compared to the nuclease activity of the corresponding parent sequence of SEQ ID NO: 23.

An effector protein may be small, which may be beneficial for nucleic acid detection or editing (for example, the effector protein may be less likely to adsorb to a surface or another biological species due to its small size). The smaller nature of these effector proteins may allow them to be more easily packaged and delivered with higher efficiency in the context of genome editing and more readily incorporated as a reagent in an assay. In some instances, the length of the effector protein is less than 400 amino acids. In some instances, the length of the effector protein is at least 368 amino acids. In some instances, the length of the effector protein is 368 to 378, 368 to 398, or 368 to 400 amino acids. In some instances, the length of the effector protein is at least 400 linked amino acid residues. In some instances, the length of the effector protein is less than 500 linked amino acid residues. In some instances, the length of the effector protein is about 400 to about 500 linked amino acid residues. In some instances, the length of the effector protein is about 380 to about 850 linked amino acid residues. In some instances, the length of the effector protein is about 300 to about 700 linked amino acid residues. In some instances, the length of the effector protein is about 450 to about 550, about 330 to about 600, about 380 to about 500, about 400 to about 420, about 420 to about 440, about 440 to about 460, about 460 to about 480, about 480 to about 500, about 500 to about 520, about 520 to about 540, about 540 to about 560, about 560 to about 580, about 580 to about 600, about 600 to about 620, about 620 to about 640, about 640 to about 660, about 660 to about 680, or about 680 to about 700 linked amino acids. In some cases, linked amino acids comprise at least two amino acids linked by an amide bond.

In some instances, the effector proteins function as an endonuclease that catalyzes cleavage within a target nucleic acid. In some instances, the effector proteins are capable of catalyzing non-sequence-specific cleavage of a single-stranded nucleic acid. In some instances, the effector proteins (e.g., the effector proteins having SEQ ID NOs: 1-45, 202-293) are activated to perform trans cleavage activity after binding of a guide nucleic acid with a target nucleic acid. This trans cleavage activity may also be referred to as “collateral” or “transcollateral” cleavage. Trans cleavage activity may be non-specific cleavage of nearby single-stranded nucleic acid by the activated effector protein, such as trans cleavage of detector nucleic acids with a detection moiety.

Effector proteins disclosed herein may function as an endonuclease that catalyzes cleavage at a specific position (e.g., at a specific nucleotide within a nucleic acid sequence) in a target nucleic acid. The target nucleic acid may be single-stranded RNA (ssRNA), double-stranded DNA (dsDNA), or single-stranded DNA (ssDNA). In some instances, the target nucleic acid is single-stranded DNA. In some instances, the target nucleic acid is single-stranded RNA. The effector proteins may provide cis cleavage activity, trans cleavage activity, nickase activity, or a combination thereof. Cis cleavage activity is cleavage of a target nucleic acid that is hybridized to a guide RNA (e.g., a dual gRNA or a sgRNA), wherein cleavage occurs within or directly adjacent to the region of the target nucleic acid that is hybridized to the guide RNA. Trans cleavage activity (also referred to as transcollateral cleavage) is cleavage of ssDNA or ssRNA that is near, but not hybridized to, the guide RNA. Trans cleavage activity is triggered by the hybridization of the guide RNA to the target nucleic acid. Nickase activity is a selective cleavage of one strand of a dsDNA. While certain effector proteins may be used to edit and detect nucleic acids in a sequence-specific manner, challenging biological sample conditions (e.g., high viscosity, metal chelating) may limit their accuracy and effectiveness. There is thus a need for systems and methods that employ effector proteins having specificity and efficiency across a wide range of sample conditions.

Effector proteins of the present disclosure, dimers thereof, and multimeric complexes thereof may cleave or nick a target nucleic acid within or near a protospacer adjacent motif (PAM) sequence of the target nucleic acid. In some embodiments, a PAM is a nucleotide sequence found in a target nucleic acid that directs an effector protein to modify the target nucleic acid at a specific location. In some cases, a PAM sequence may be required for a complex having an effector protein and a guide nucleic acid to hybridize to and modify the target nucleic acid. However, a given effector protein may not require a PAM sequence being present in a target nucleic acid for the effector protein to modify the target nucleic acid. In some instances, cleavage occurs within 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleosides of a 5′ or 3′ terminus of a PAM sequence. A target nucleic acid may comprise a PAM sequence adjacent to a sequence that is complementary to a guide nucleic acid spacer region. In some instances, the effector protein recognizes a PAM as shown in TABLE 6. In some instances, a composition comprising an effector protein recognizes a PAM sequence comprising any of the following nucleotide sequences: CTT (SEQ ID NO: 154), CC (SEQ ID NO: 155), TCG (SEQ ID NO: 156), GCG (SEQ ID NO: 157), TTG (SEQ ID NO: 158), GTG (SEQ ID NO: 159), ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163), TC (SEQ ID NO: 164), ACTG (SEQ ID NO: 165), GCTG (SEQ ID NO: 166), TTC (SEQ ID NO: 167), or TTT (SEQ ID NO: 168) as shown in TABLE 6. In some instances, the effector protein recognizes a PAM set forth in FIG. 1.

In some instances, the effector protein recognizes a PAM as shown in TABLE 13. In some instances, the effector protein recognizes a PAM as shown in TABLE 14. In some instances, the effector protein recognizes a PAM as shown in TABLE 16. In some instances, the effector protein recognizes a PAM as shown in TABLE 17. In some instances, the effector protein recognizes a PAM as shown in TABLE 20. In some instances, the effector protein recognizes a PAM as shown in TABLE 21. In some instances, the effector protein recognizes a PAM as shown in TABLE 22. In some instances, the effector protein recognizes a PAM as shown in TABLE 23. In some instances, the PAM sequence comprises a sequence listed in TABLE 24. In some instances, the PAM sequence comprises a sequence listed in TABLE 35. In some instances, the effector protein recognizes a PAM set forth in FIGS. 7A-7E. In some instances, the effector protein recognizes a PAM of SEQ ID NOs: 368, 369, 370, or 371. In some instances, the effector protein recognizes a PAM of SEQ ID NOs: 304, 312, 313, 315, 324 or 335. In some instances, the effector protein recognizes a PAM of SEQ ID NOs: 301, 318, 335, 343, 360, or 365. In some instances, the effector protein recognizes a PAM of SEQ ID NO: 368. In some instances, the effector protein recognizes a PAM of SEQ ID NO: 343. In some instances, the effector protein recognizes a PAM of SEQ ID NOs: 325, 326, 327, or 328. In some embodiments, effector proteins do not require a PAM sequence to cleave or nick a target nucleic acid.

In some instances, the effector protein comprises six amino acid sequences selected from the group comprising: (i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793 (shown in TABLE 32), (ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794 (shown in TABLE 32), (iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795 (shown in TABLE 32), (iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796 (shown in TABLE 32), (v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797 (shown in TABLE 32), (vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798 (shown in TABLE 32), and (vii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799 (shown in TABLE 32).

MEME_1 to MEME_7 are PROSITE motifs, a format which is routinely used in the art to describe a consensus sequence. For example, the PROSITE sequence [NH]AD corresponds to the sequences NAD and HAD. When an amino acid sequence is analysed to calculate the degree of identity to the PROSITE sequence [NH]AD, both NAD and HAD are given equal weight. In other words, both NAD and HAD share 100% identity with the PROSITE motif [NH]AD.
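
The equal weighting of NAD and HAD can be illustrated with a short, non-limiting sketch. It handles only a small subset of PROSITE pattern syntax (literal residues, bracketed alternatives such as `[NH]`, the lowercase wildcard `x`, and optional `-` separators), and it scores identity position by position against a peptide of the same length as the motif, with no gaps. That scoring rule is an assumption made for illustration, not a statement of how identity to SEQ ID NOs: 793-799 is calculated in the disclosure. The function names are hypothetical.

```python
def parse_motif(pattern):
    """Turn a simple PROSITE-style pattern into one entry per position.

    Each entry is a set of allowed residues, or None for the wildcard 'x'.
    """
    positions = []
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if char == "[":
            end = pattern.index("]", i)
            positions.append(set(pattern[i + 1:end]))
            i = end + 1
        elif char == "-":
            i += 1  # separators carry no positional information
        elif char == "x":
            positions.append(None)
            i += 1
        else:
            positions.append({char})
            i += 1
    return positions


def motif_identity(pattern, peptide):
    """Percent of positions at which `peptide` satisfies the motif."""
    positions = parse_motif(pattern)
    if len(positions) != len(peptide):
        raise ValueError("peptide and motif must have the same length")
    matches = sum(
        1
        for allowed, residue in zip(positions, peptide)
        if allowed is None or residue in allowed
    )
    return 100.0 * matches / len(positions)
```

Under these assumptions, `motif_identity("[NH]AD", "NAD")` and `motif_identity("[NH]AD", "HAD")` both evaluate to 100.0, whereas a peptide such as QAD satisfies only two of the three positions.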

In some instances, the effector protein comprises seven amino acid sequences selected from the group: (i) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799.

In preferred embodiments, the effector protein comprises six amino acid sequences selected from the group: (i) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 799. In further preferred embodiments, the effector protein comprises six amino acid sequences selected from the group: (i) an amino acid sequence that is at least 80% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 80% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 80% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 80% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 799.

In some instances, the effector protein comprises an amino acid sequence that is (1) at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23, preferably at least 68% identical to SEQ ID NO: 23, and (2) includes six amino acid sequences selected from the group: (i) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 799.

In some instances, the effector protein comprises an amino acid sequence that is (1) at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23 and (2) includes six amino acid sequences selected from the group comprising: (i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799.

In some preferred embodiments, the effector protein comprises an amino acid sequence that is (1) at least 68% identical to SEQ ID NO: 23, and (2) includes six amino acid sequences selected from the group: (i) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 799.

In some instances, the effector protein comprises an amino acid sequence that is at least 37%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796.

In some instances, the effector protein comprises (1) an amino acid sequence that is at least 37%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, and (2) four amino acid sequences selected from the group: (i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, (v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and (vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799. In some further instances, the effector protein comprises an amino acid sequence that is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23, preferably wherein the amino acid sequence is at least 68% identical to SEQ ID NO: 23.

In some instances, the effector protein comprises (1) an amino acid sequence that is at least 37%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, and (2) four amino acid sequences selected from the group: (i) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 797, (v) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 798, and (vi) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 799. In some further instances, the effector protein comprises an amino acid sequence that is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23, preferably wherein the amino acid sequence is at least 68% identical to SEQ ID NO: 23.

In some instances, the effector protein comprises one or more of: (i) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 793, (ii) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 794, (iii) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 795, (iv) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 796, (v) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 797, (vi) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 798, and (vii) an amino acid sequence that is at least 80%, preferably at least 90%, identical to SEQ ID NO: 799.

In some instances, the effector protein comprises amino acid sequences that have at least a threshold identity referred to herein to any one of SEQ ID NO: 793 to SEQ ID NO: 799 and the amino acid sequences are in the following order, starting from the N-terminus: (i) the sequence having at least the threshold identity with SEQ ID NO: 796, (ii) the sequence having at least the threshold identity with SEQ ID NO: 797, (iii) the sequence having at least the threshold identity with SEQ ID NO: 795, (iv) the sequence having at least the threshold identity with SEQ ID NO: 799, (v) the sequence having at least the threshold identity with SEQ ID NO: 794, (vi) the sequence having at least the threshold identity with SEQ ID NO: 793, and (vii) the sequence having at least the threshold identity with SEQ ID NO: 798. In some instances, the effector protein does not include an amino acid sequence that meets a specified degree of identity (i.e., the threshold identity) with any one of SEQ ID NO: 793 to SEQ ID NO: 799. For example, in some instances, the effector protein does not include an amino acid sequence having 36.5% or more identity with SEQ ID NO: 796, and the effector protein comprises, distributed through the protein starting from the N-terminus, (i) a sequence having at least the threshold identity with SEQ ID NO: 797, (ii) a sequence having at least the threshold identity with SEQ ID NO: 795, (iii) a sequence having at least the threshold identity with SEQ ID NO: 799, (iv) a sequence having at least the threshold identity with SEQ ID NO: 794, (v) a sequence having at least the threshold identity with SEQ ID NO: 793, and (vi) a sequence having at least the threshold identity with SEQ ID NO: 798.

In some instances, effector proteins have been modified. In some embodiments, D2S effector proteins disclosed herein or a variant thereof may comprise an NLS. In some cases, an NLS comprises an entity (e.g., peptide) that facilitates localization of a nucleic acid, protein, or small molecule to the nucleus, when present in a cell that contains a nuclear compartment. An NLS can be located at or near the amino terminus (N-terminus) of the D2S effector proteins disclosed herein. An NLS can be located at or near the carboxy terminus (C-terminus) of the D2S effector proteins disclosed herein. In some embodiments, a vector encodes the D2S effector proteins described herein, wherein the vector or vector systems disclosed herein comprises one or more NLSs, such as about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs. In some embodiments, a D2S effector protein described herein comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the N-terminus, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the C-terminus, or a combination of these (e.g., one or more NLS at the amino-terminus and one or more NLS at the carboxy terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus.
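By way of non-limiting illustration, the "near the N- or C-terminus" criterion may be expressed as a distance check. The sketch below assumes that distance is counted as the number of residues lying between the terminus and the nearest residue of the NLS; this counting convention, the function names, and the toy sequences are assumptions made for illustration only.

```python
from typing import Optional, Tuple


def nls_terminal_distances(protein: str, nls: str) -> Optional[Tuple[int, int]]:
    """Residues between the first NLS occurrence and each terminus.

    Returns (distance_from_N_terminus, distance_from_C_terminus),
    or None when the NLS is not found in the protein.
    """
    start = protein.find(nls)
    if start < 0:
        return None
    end = start + len(nls)
    return start, len(protein) - end


def is_near_terminus(protein: str, nls: str, window: int) -> Tuple[bool, bool]:
    """(near_N, near_C) for a chosen window, e.g. 5, 10, or 50 residues."""
    distances = nls_terminal_distances(protein, nls)
    if distances is None:
        return False, False
    return distances[0] <= window, distances[1] <= window
```

For example, for a hypothetical protein consisting of two residues, a seven-residue NLS, and fifty further residues, the NLS is near the N-terminus for a window of 5 but is not near the C-terminus.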

In certain embodiments, the nucleotide sequence encoding the effector protein is codon optimized (e.g., for expression in a eukaryotic cell) relative to the naturally occurring sequence. In some embodiments, D2S effector proteins described herein are encoded by a codon optimized nucleic acid. In some embodiments, a nucleic acid sequence encoding a D2S effector protein described herein is codon optimized. This type of optimization can entail a mutation of a D2S effector protein-encoding nucleotide sequence to mimic the codon preferences of the intended host organism or cell while encoding the same polypeptide. Thus, the codons can be changed, but the encoded protein remains unchanged. For example, if the intended target cell were a human cell, a human codon-optimized D2S effector protein-encoding nucleotide sequence could be used. As another non-limiting example, if the intended host cell were a mouse cell, then a mouse codon-optimized D2S effector protein-encoding nucleotide sequence could be generated. As another non-limiting example, if the intended host cell were a eukaryotic cell, then a eukaryote codon-optimized D2S effector protein-encoding nucleotide sequence could be generated. As another non-limiting example, if the intended host cell were a prokaryotic cell, then a prokaryote codon-optimized D2S effector protein-encoding nucleotide sequence could be generated. Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon. Effector proteins may be codon optimized for expression in a specific cell, for example, a bacterial cell, a plant cell, a eukaryotic cell, an animal cell, a mammalian cell, or a human cell. In some embodiments, the effector protein is codon optimized for a human cell.
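By way of non-limiting illustration, the principle that codons can be changed while the encoded protein remains unchanged may be sketched as follows. The sketch translates a coding sequence with the standard genetic code and re-encodes each amino acid with a single preferred codon. The PREFERRED table below is an illustrative placeholder and is not taken from any codon usage database; in practice the preferred codons would be chosen from a usage table for the intended host, and practical optimization may weigh additional factors that this sketch ignores.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Placeholder choice of one codon per amino acid (assumption, for illustration).
PREFERRED = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "AGA",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate complete codons of an uppercase DNA coding sequence."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, usable, 3))


def codon_optimize(dna: str, preferred=PREFERRED) -> str:
    """Re-encode each codon with the preferred synonymous codon."""
    return "".join(preferred[aa] for aa in translate(dna))
```

For example, a hypothetical coding sequence ATGAAACGTTTATAA is re-encoded as ATGAAGAGACTGTGA; both sequences translate to the same polypeptide.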

It is understood that when describing coding sequences of polypeptides described herein, said coding sequences do not necessarily require a codon encoding an N-terminal Methionine (M) or a Valine (V) as described for the D2S effector proteins described herein. One skilled in the art would understand that a start codon could be replaced or substituted with a start codon that encodes for an amino acid residue sufficient for initiating translation in a host cell. In some instances, when a modifying heterologous peptide, such as a fusion protein partner, is located at the N terminus of the effector protein, a start codon for the fusion protein partner serves as a start codon for the effector protein as well. Thus, the natural start codon encoding an amino acid residue sufficient for initiating translation (e.g., Methionine (M) or a Valine (V)) of the effector protein may be removed or absent.

In some cases, compositions comprise a D2S effector protein and a cell. In some embodiments, compositions comprise a cell that expresses a D2S effector protein. In some cases, compositions comprise a nucleic acid encoding a D2S effector protein and a cell. In some embodiments, compositions comprise a cell expressing a nucleic acid encoding a D2S effector protein. In some instances, the cell is a prokaryotic cell. In some instances, the cell is a eukaryotic cell. In some instances, the cell is a mammalian cell.

D2S effector proteins of the present disclosure may be produced in vitro or by eukaryotic cells or by prokaryotic cells. D2S effector proteins can be further processed by unfolding, e.g., heat denaturation, dithiothreitol reduction, etc., and may be further refolded, using any suitable method. D2S effector proteins of the present disclosure may be synthesized, using any suitable method.

In some embodiments, D2S effector proteins described herein can be isolated and purified for use in compositions, systems, and/or methods described herein. Methods described here can include the step of isolating D2S effector proteins described herein. Compositions and/or systems described herein can further comprise a purification tag that can be attached to a D2S effector protein, or a nucleic acid encoding for a purification tag that can be attached to a nucleic acid encoding for a D2S effector protein as described herein. A purification tag, as used herein, can be an amino acid sequence which can attach or bind with high affinity to a separation substrate and assist in isolating the protein of interest from its environment, which can be its biological source, such as a cell lysate. Attachment of the purification tag can be at the N or C terminus of the D2S effector protein. In some instances, when a purification tag is located at the N terminus of the effector protein, a start codon for the purification tag serves as a start codon for the effector protein as well. Thus, the natural start codon of the effector protein may be removed or absent. Furthermore, an amino acid sequence recognized by a protease, or a nucleic acid encoding for an amino acid sequence recognized by a protease, such as TEV protease or the HRV3C protease, can be inserted between the purification tag and the D2S effector protein, such that biochemical cleavage of the sequence with the protease after initial purification liberates the purification tag. Purification and/or isolation can be through high performance liquid chromatography (HPLC), exclusion chromatography, gel electrophoresis, affinity chromatography, or other purification technique. Non-limiting examples of purification tags include a histidine tag, e.g., a 6×His tag; a hemagglutinin (HA) tag; a FLAG tag; a Myc tag; and maltose binding protein (MBP). In some embodiments, an effector protein is fused or linked (e.g., via an amide bond) to a fluorescent protein. Non-limiting examples of fluorescent proteins include green fluorescent protein (GFP), yellow fluorescent protein (YFP), red fluorescent protein (RFP), cyan fluorescent protein (CFP), mCherry, and tdTomato.

For example, in some embodiments, D2S effector proteins described herein are isolated from cell lysate. In some embodiments, the compositions described herein can comprise 20% or more by weight, 75% or more by weight, 95% or more by weight, or 99.5% or more by weight of a D2S effector protein, depending on the method of preparation of the compositions described herein and the purification thereof, wherein percentages can be based upon total protein content in relation to contaminants. Thus, in some cases, a D2S effector protein described herein is at least 80% pure, at least 85% pure, at least 90% pure, at least 95% pure, at least 98% pure, or at least 99% pure (e.g., free of contaminants, non-engineered polypeptide proteins or other macromolecules, etc.).

Engineered Proteins

In some instances, effector proteins disclosed herein are engineered proteins. Engineered proteins are not identical to a naturally-occurring protein. Such an engineered protein can include one or more mutations, including an insertion, deletion or substitution (e.g., conservative or non-conservative substitution). An engineered protein, in some embodiments, includes at least one mutation relative to a reference protein (e.g., a naturally-occurring protein). In some embodiments, an engineered protein includes at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25 or at least 30 mutations relative to a reference protein (e.g., a naturally-occurring protein). In some embodiments, an engineered protein includes no more than 10, 20, 30, 40, or 50 mutations relative to a reference protein (e.g., a naturally-occurring protein). Engineered proteins may provide enhanced nuclease or nickase activity as compared to a naturally occurring nuclease or nickase. By way of non-limiting example, some engineered proteins exhibit optimal activity at lower salinity and viscosity than the protoplasm of their bacterial cell of origin. Also, by way of non-limiting example, bacteria often comprise protoplasmic salt concentrations greater than 250 mM and room temperature intracellular viscosities above 2 centipoise, whereas engineered proteins exhibit optimal activity (e.g., cis-cleavage activity) at salt concentrations below 150 mM and viscosities below 1.5 centipoise. The present disclosure leverages these dependencies by providing engineered proteins in solutions optimized for their activity and stability.

Compositions and systems described herein may comprise an engineered effector protein in a solution comprising a room temperature viscosity of less than about 15 centipoise, less than about 12 centipoise, less than about 10 centipoise, less than about 8 centipoise, less than about 6 centipoise, less than about 5 centipoise, less than about 4 centipoise, less than about 3 centipoise, less than about 2 centipoise, or less than about 1.5 centipoise.

Compositions and systems may comprise an engineered effector protein in a solution comprising an ionic strength of less than about 500 mM, less than about 400 mM, less than about 300 mM, less than about 250 mM, less than about 200 mM, less than about 150 mM, less than about 100 mM, less than about 80 mM, less than about 60 mM, or less than about 50 mM. Compositions and systems may comprise an engineered effector protein and an assay excipient, which may stabilize a reagent or product, prevent aggregation or precipitation, or enhance or stabilize a detectable signal (e.g., a fluorescent signal). Examples of assay excipients include, but are not limited to, saccharides and saccharide derivatives (e.g., sodium carboxymethyl cellulose and cellulose acetate), detergents, glycols, polyols, esters, buffering agents, alginic acid, and organic solvents (e.g., DMSO).

An engineered protein may comprise a modified form of a wild type counterpart protein (e.g., a D2S effector protein). The modified form of the wild type counterpart may comprise an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nucleic acid-cleaving activity of the effector protein relative to the wild type counterpart. For example, a nuclease domain (e.g., RuvC domain) of a D2S effector protein may be deleted or mutated relative to a wild type counterpart D2S effector protein so that it is no longer functional or comprises reduced nuclease activity. The modified form of the effector protein may have less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nucleic acid-cleaving activity of the wild-type counterpart. Engineered proteins may have no substantial nucleic acid-cleaving activity. An engineered protein may be enzymatically inactive or “dead,” that is, it may bind to a nucleic acid but not cleave it. An enzymatically inactive protein may comprise an enzymatically inactive domain (e.g., inactive nuclease domain). Enzymatically inactive may refer to an activity less than 1%, less than 2%, less than 3%, less than 4%, less than 5%, less than 6%, less than 7%, less than 8%, less than 9%, or less than 10% activity compared to the wild-type counterpart. A dead protein may associate with a guide nucleic acid to activate or repress transcription of a target nucleic acid sequence. In some instances, the enzymatically inactive protein is fused with a protein comprising recombinase activity.

Fusion Proteins

In some instances, an effector protein is a fusion protein, wherein the fusion protein comprises a D2S effector protein and a fusion partner protein. In some instances, the D2S effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-5. In some instances, the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 728-731. Unless otherwise indicated, reference to effector proteins throughout the present disclosure includes fusion proteins thereof.

In some embodiments, the terms fusion effector protein, fusion protein, and fusion polypeptide each refer to a protein comprising at least two heterologous polypeptides. Often a fusion effector protein comprises an effector protein and a fusion partner protein. In general, the fusion partner protein is not an effector protein.

In some embodiments, a fusion partner protein or a fusion partner comprises a protein, polypeptide or peptide that is fused to an effector protein. The fusion partner generally imparts some function to the fusion protein that is not provided by the effector protein. The fusion partner may provide a detectable signal. The fusion partner may modify a target nucleic acid, including changing a nucleobase of the target nucleic acid and making a chemical modification to one or more nucleotides of the target nucleic acid. The fusion partner may be capable of modulating the expression of a target nucleic acid. The fusion partner may inhibit, reduce, activate or increase expression of a target nucleic acid via additional proteins or nucleic acid modifications to the target sequence.

A fusion partner protein is also simply referred to herein as a fusion partner. In some instances, the fusion partner promotes the formation of a multimeric complex of the D2S effector protein. In some instances, the fusion partner inhibits the formation of a multimeric complex of the D2S effector protein. By way of non-limiting example, the fusion protein may comprise a D2S effector protein and a fusion partner comprising a Calcineurin A tag, wherein the fusion protein dimerizes in the presence of Tacrolimus (FK506). Also by way of non-limiting example, the fusion protein may comprise a D2S effector protein and a SpyTag configured to dimerize or associate with another effector protein in a multimeric complex.

In some instances, the fusion partner is fused to the N-terminus of the effector protein. In some instances, the fusion partner is fused to the C-terminus of the effector protein. The terms “fused” and “linked” are interchangeable.

In some instances, more than one fusion partner is fused to the effector protein. In some instances, a further fusion partner is fused to a first fusion partner that is fused to the effector protein.

In some instances, the fusion partner modulates transcription (e.g., inhibits transcription, increases transcription) of a target nucleic acid. In some instances, the fusion partner is a protein (or a domain from a protein) that inhibits transcription, also referred to as a transcriptional repressor. Transcriptional repressors may inhibit transcription via recruitment of transcription inhibitor proteins, modification of target DNA such as methylation, recruitment of a DNA modifier, modulation of histones associated with target DNA, recruitment of a histone modifier such as those that modify acetylation and/or methylation of histones, or a combination thereof. In some instances, the fusion partner is a protein (or a domain from a protein) that increases transcription, also referred to as a transcription activator. Transcriptional activators may promote transcription via recruitment of transcription activator proteins, modification of target DNA such as demethylation, recruitment of a DNA modifier, modulation of histones associated with target DNA, recruitment of a histone modifier such as those that modify acetylation and/or methylation of histones, or a combination thereof. In some instances, the fusion partner is a reverse transcriptase.

In some instances, the fusion partner is a base editor. In general, a base editor comprises a deaminase that when fused with a D2S protein changes a nucleobase to a different nucleobase, e.g., cytosine to thymine or guanine to adenine. In some instances, the base editor comprises a deaminase.

In some instances, fusion partners provide enzymatic activity that modifies a target nucleic acid. Such enzymatic activities include, but are not limited to, nuclease activity, methyltransferase activity, demethylase activity, DNA repair activity, DNA damage activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer forming activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity, and glycosylase activity.

Modifying Target Nucleic Acids

In some instances, fusion partners have enzymatic activity that modifies the target nucleic acid. The target nucleic acid may comprise or consist of a ssRNA, dsRNA, ssDNA, or a dsDNA. Examples of enzymatic activity that modifies the target nucleic acid include, but are not limited to: nuclease activity such as that provided by a restriction enzyme (e.g., FokI nuclease); methyltransferase activity such as that provided by a methyltransferase (e.g., HhaI DNA m5c-methyltransferase (M.HhaI), DNA methyltransferase 1 (DNMT1), DNA methyltransferase 3a (DNMT3a), DNA methyltransferase 3b (DNMT3b), METI, DRM3 (plants), ZMET2, CMT1, CMT2 (plants)); demethylase activity such as that provided by a demethylase (e.g., Ten-Eleven Translocation (TET) dioxygenase 1 (TET1CD), TET1, DME, DML1, DML2, ROS1); DNA repair activity; DNA damage (e.g., oxygenation) activity; deamination activity such as that provided by a deaminase (e.g., a cytosine deaminase enzyme such as rat APOBEC1); dismutase activity; alkylation activity; depurination activity; oxidation activity; pyrimidine dimer forming activity; integrase activity such as that provided by an integrase and/or resolvase (e.g., Gin invertase such as the hyperactive mutant of the Gin invertase, GinH106Y; human immunodeficiency virus type 1 integrase (IN); Tn3 resolvase); transposase activity; recombinase activity such as that provided by a recombinase (e.g., catalytic domain of Gin recombinase); as well as polymerase activity, ligase activity, helicase activity, photolyase activity, and glycosylase activity.

Non-limiting examples of fusion partners for targeting ssRNA include splicing factors (e.g., RS domains); protein translation components (e.g., translation initiation, elongation, and/or release factors; e.g., eIF4G); RNA methylases; RNA editing enzymes (e.g., RNA deaminases, e.g., adenosine deaminase acting on RNA (ADAR), including A to I and/or C to U editing enzymes); helicases; and RNA-binding proteins. It is understood that a fusion protein may include the entire protein or in some instances may include a fragment of the protein (e.g., a functional domain). In some instances, the functional domain interacts with or binds ssRNA, including intramolecular and/or intermolecular secondary structures thereof (e.g., hairpins, stem-loops, etc.). In some embodiments, a functional domain comprises a region of one or more amino acids in a protein that is required for an activity of the protein, or the full extent of that activity, as measured in an in vitro assay. Activities include, but are not limited to, nucleic acid binding, nucleic acid modification, nucleic acid cleavage, and protein binding. The absence of the functional domain, including mutations of the functional domain, would abolish or reduce activity. The functional domain may interact transiently or irreversibly, directly or indirectly. Fusion proteins may comprise a protein or domain thereof selected from: endonucleases (e.g., RNase III, the CRR22 DYW domain, Dicer, and PIN (PilT N-terminus)); SMG5 and SMG6; domains responsible for stimulating RNA cleavage (e.g., CPSF, CstF, CFIm and CFIIm); exonucleases such as XRN-1 or Exonuclease T; deadenylases such as HNT3; protein domains responsible for nonsense mediated RNA decay (e.g., UPF1, UPF2, UPF3, UPF3b, RNP 51, Y14, DEK, REF2, and SRm160); protein domains responsible for stabilizing RNA (e.g., PABP); proteins and protein domains responsible for repressing translation (e.g., Ago2 and Ago4); proteins and protein domains responsible for stimulating translation (e.g., Staufen); proteins and protein domains responsible for (e.g., capable of) modulating translation (e.g., translation factors such as initiation factors, elongation factors, release factors, etc., e.g., eIF4G); proteins and protein domains responsible for polyadenylation of RNA (e.g., PAP1, GLD-2, and Star-PAP); proteins and protein domains responsible for polyuridinylation of RNA (e.g., CI D1 and terminal uridylate transferase); proteins and protein domains responsible for RNA localization (e.g., from IMP1, ZBP1, She2p, She3p, and Bicaudal-D); proteins and protein domains responsible for nuclear retention of RNA (e.g., Rrp6); proteins and protein domains responsible for nuclear export of RNA (e.g., TAP, NXF1, THO, TREX, REF, and Aly); proteins and protein domains responsible for repression of RNA splicing (e.g., PTB, Sam68, and hnRNP A1); proteins and protein domains responsible for stimulation of RNA splicing (e.g., Serine/Arginine-rich (SR) domains); proteins and protein domains responsible for reducing the efficiency of transcription (e.g., FUS (TLS)); and proteins and protein domains responsible for stimulating transcription (e.g., CDK7 and HIV Tat). Alternatively, the effector domain may be a domain of a protein selected from the group comprising endonucleases; proteins and protein domains capable of stimulating RNA cleavage; exonucleases; deadenylases; proteins and protein domains having nonsense mediated RNA decay activity; proteins and protein domains capable of stabilizing RNA; proteins and protein domains capable of repressing translation; proteins and protein domains capable of stimulating translation; proteins and protein domains capable of modulating translation (e.g., translation factors such as initiation factors, elongation factors, release factors, etc., e.g., eIF4G); proteins and protein domains capable of polyadenylation of RNA; proteins and protein domains capable of polyuridinylation of RNA; proteins and protein domains having RNA localization activity; proteins and protein domains capable of nuclear retention of RNA; proteins and protein domains having RNA nuclear export activity; proteins and protein domains capable of repression of RNA splicing; proteins and protein domains capable of stimulation of RNA splicing; proteins and protein domains capable of reducing the efficiency of transcription; and proteins and protein domains capable of stimulating transcription. Another suitable fusion partner is a PUF RNA-binding domain, which is described in more detail in WO2012068627, which is hereby incorporated by reference in its entirety.

In some instances, the fusion partner comprises an RNA splicing factor. The RNA splicing factor may be used (in whole or as fragments thereof) for modular organization, with separate sequence-specific RNA binding modules and splicing effector domains. Non-limiting examples of RNA splicing factors include members of the Serine/Arginine-rich (SR) protein family, which contain N-terminal RNA recognition motifs (RRMs) that bind to exonic splicing enhancers (ESEs) in pre-mRNAs and C-terminal RS domains that promote exon inclusion. As another example, the hnRNP protein hnRNP A1 binds to exonic splicing silencers (ESSs) through its RRM domains and inhibits exon inclusion through a C-terminal Glycine-rich domain. Some splicing factors may regulate alternative use of splice sites (ss) by binding to regulatory sequences between the two alternative sites. For example, ASF/SF2 may recognize ESEs and promote the use of intron proximal sites, whereas hnRNP A1 may bind to ESSs and shift splicing towards the use of intron distal sites. One application for such factors is to generate engineered splicing factors (ESFs) that modulate alternative splicing of endogenous genes, particularly disease associated genes. For example, Bcl-x pre-mRNA produces two splicing isoforms with two alternative 5′ splice sites to encode proteins of opposite functions. The long splicing isoform Bcl-xL is a potent apoptosis inhibitor expressed in long-lived postmitotic cells and is up-regulated in many cancer cells, protecting cells against apoptotic signals. The short isoform Bcl-xS is a pro-apoptotic isoform and is expressed at high levels in cells with a high turnover rate (e.g., developing lymphocytes). The ratio of the two Bcl-x splicing isoforms is regulated by multiple cis-elements that are located in either the core exon region or the exon extension region (i.e., between the two alternative 5′ splice sites). For more examples, see WO2010075303, which is hereby incorporated by reference in its entirety.

In some instances, fusion proteins are targeted by a guide nucleic acid (guide RNA) to a specific location in the target nucleic acid and exert locus-specific regulation such as blocking RNA polymerase binding to a promoter (which selectively inhibits transcription activator function), and/or modifying the local chromatin status (e.g., when a fusion sequence is used that modifies the target nucleic acid or modifies a protein associated with the target nucleic acid). In some instances, the modifications are transient (e.g., transcription repression or activation). In some instances, the modifications are inheritable. For instance, epigenetic modifications made to a target nucleic acid, or to proteins associated with the target nucleic acid, e.g., nucleosomal histones, in a cell, are observed in cells produced by proliferation of the cell.

CRISPRa Fusions and CRISPRi Fusions

In some instances, fusion partners include, but are not limited to, a protein that directly and/or indirectly provides for increased or decreased transcription and/or translation of a target nucleic acid (e.g., a transcription activator or a fragment thereof, a protein or fragment thereof that recruits a transcription activator, a small molecule/drug-responsive transcription and/or translation regulator, a translation-regulating protein, etc.). In some instances, fusion partners that increase or decrease transcription include a transcription activator domain or a transcription repressor domain, respectively.

In some embodiments, fusion partners activate or increase expression of a target nucleic acid. Fusion proteins comprising such fusion partners and a Cas effector protein may be referred to as CRISPRa fusions. In some embodiments, fusion partners increase expression of the target nucleic acid relative to its expression in the absence of the fusion protein. Relative expression, including transcription and RNA levels, may be assessed, quantified, and compared, e.g., by RT-qPCR. In some embodiments, fusion partners comprise a transcriptional activator. Transcriptional activators may promote transcription via: recruitment of other transcription factor proteins; modification of target DNA such as demethylation; recruitment of a DNA modifier; modulation of histones associated with target DNA; recruitment of a histone modifier such as those that modify acetylation and/or methylation of histones; or a combination thereof.
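By way of non-limiting illustration, relative expression from RT-qPCR data is often summarized as a fold change computed from threshold cycle (Ct) values. The sketch below uses the commonly applied 2^(-ΔΔCt) calculation, which assumes approximately 100% amplification efficiency and a stably expressed reference gene. The present disclosure names RT-qPCR but does not specify this analysis; the method choice, the function name, and the parameter names are assumptions made for illustration.

```python
def relative_expression(ct_target_treated: float,
                        ct_reference_treated: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of a target transcript, treated versus control (2^-ddCt).

    "treated" denotes cells containing the fusion protein and guide nucleic
    acid; "control" denotes cells lacking the fusion protein.
    """
    # Normalize the target Ct to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    # Compare the normalized values between conditions.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)
```

A value greater than 1 would be consistent with increased expression (as for a CRISPRa fusion), and a value less than 1 would be consistent with reduced expression (as for a CRISPRi fusion). For example, hypothetical Ct values of 22 and 18 (target and reference, treated) and 25 and 18 (target and reference, control) give a fold change of 8.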

In some cases, a fusion partner that promotes or increases transcriptionis VPR. In some embodiments, VPR can be fused to a catalyticallyinactive effector protein. In some embodiments, the amino acid sequenceof VPR is DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLINSRSSGSPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLF (SEQ ID NO: 300). In some embodiments, a fusionpartner protein comprises an amino acid sequence that is at least 75%,at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, atleast 99% or 100% identical to SEQ ID NO: 300.

Non-limiting examples of fusion partners that promote or increase transcription include: transcriptional activators such as VP16, VP64, VP48, VP160, p65 subdomain (e.g., from NFkB), and activation domain of EDLL and/or TAL activation domain (e.g., for activity in plants); histone lysine methyltransferases such as SET1A, SET1B, MLL1 to 5, ASH1, SMYD2, NSD1; histone lysine demethylases such as JHDM2a/b, UTX, JMJD3; histone acetyltransferases such as GCN5, PCAF, CBP, p300, TAF1, TIP60/PLIP, MOZ/MYST3, MORF/MYST4, SRC1, ACTR, P160, CLOCK; DNA demethylases such as Ten-Eleven Translocation (TET) dioxygenase 1 (TET1CD), TET1, DME, DML1, DML2, and ROS1; and functional domains thereof.

In some embodiments, a target nucleic acid for increased expression comprises NEUROD1, HBG1, ASCL1, LIN28A, or any combination thereof. In some cases, to increase the expression of the target nucleic acid, a guide RNA comprises a sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to any one of SEQ ID NOs: 647-710.

In some embodiments, fusion partners inhibit or reduce expression of a target nucleic acid. Fusion proteins comprising such fusion partners and an effector protein may be referred to as CRISPRi fusions. In some embodiments, fusion partners reduce expression of the target nucleic acid relative to its expression in the absence of the fusion effector protein. Relative expression, including transcription and RNA levels, may be assessed, quantified, and compared, e.g., by RT-qPCR. In some embodiments, fusion partners may comprise a transcriptional repressor. In some embodiments, a transcriptional repressor can describe a polypeptide or a fragment thereof that is capable of arresting, preventing, or reducing transcription of a target nucleic acid. Transcriptional repressors may inhibit transcription via: recruitment of other transcription factor proteins; modification of target DNA such as methylation; recruitment of a DNA modifier; modulation of histones associated with target DNA; recruitment of a histone modifier such as those that modify acetylation and/or methylation of histones; or a combination thereof.

Non-limiting examples of fusion partners that decrease or inhibit transcription include: transcriptional repressors such as the Krüppel associated box (KRAB or SKD); KOX1 repression domain; the Mad mSIN3 interaction domain (SID); the ERF repressor domain (ERD); the SRDX repression domain (e.g., for repression in plants); histone lysine methyltransferases such as Pr-SET7/8, SUV4-20H1, RIZ1, and the like; histone lysine demethylases such as JMJD2A/JHDM3A, JMJD2B, JMJD2C/GASC1, JMJD2D, JARID1A/RBP2, JARID1B/PLU-1, JARID1C/SMCX, JARID1D/SMCY; histone lysine deacetylases such as HDAC1, HDAC2, HDAC3, HDAC8, HDAC4, HDAC5, HDAC7, HDAC9, SIRT1, SIRT2, HDAC11; DNA methylases such as HhaI DNA m5c-methyltransferase (M.HhaI), DNA methyltransferase 1 (DNMT1), DNA methyltransferase 3a (DNMT3a), DNA methyltransferase 3b (DNMT3b), METI, DRM3 (plants), ZMET2, CMT1, CMT2 (plants); periphery recruitment elements such as Lamin A and Lamin B; and functional domains thereof.

In some instances, fusion partners include, but are not limited to, a protein that directly and/or indirectly provides for increased or decreased transcription and/or translation of a target nucleic acid (e.g., a transcription activator or a fragment thereof, a protein or fragment thereof that recruits a transcription activator, a small molecule/drug-responsive transcription and/or translation regulator, a translation-regulating protein, etc.). In some instances, fusion partners that increase or decrease transcription include a transcription activator domain or a transcription repressor domain, respectively.

Base Editors

In some embodiments, fusion partners modify a nucleobase of a target nucleic acid. Fusion proteins comprising such fusion partners and an effector protein may be referred to as base editors. When a base editor is described herein, it can refer to a fusion protein comprising a base editing enzyme fused to an effector protein. The base editor is functional when the effector protein is coupled to a guide nucleic acid. The guide nucleic acid imparts sequence specific activity to the base editor. By way of non-limiting example, the effector protein may comprise a catalytically inactive effector protein. Also, by way of non-limiting example, the base editing enzyme may comprise deaminase activity. Additional base editors are described herein.

In some embodiments, fusion partners modify a nucleobase of a target nucleic acid. Fusion proteins comprising such fusion partners and a Cas effector protein may be referred to as base editors. In some embodiments, base editors modify a sequence of a target nucleic acid. In some embodiments, base editors provide a nucleobase change in a DNA molecule. In some embodiments, the nucleobase change in the DNA molecule is selected from: adenine (A) to guanine (G); cytosine (C) to thymine (T); and cytosine (C) to guanine (G). In some embodiments, base editors provide a nucleobase change in an RNA molecule. In some embodiments, the nucleobase change in the RNA molecule is selected from: adenine (A) to guanine (G); uracil (U) to cytosine (C); cytosine (C) to guanine (G); and guanine (G) to adenine (A). In some embodiments, the fusion partner is a deaminase, e.g., ADAR1/2.

In some embodiments, a base editor comprises a fusion protein comprising a base editing enzyme fused to an effector protein. The base editor is functional when the effector protein is coupled to a guide nucleic acid. The guide nucleic acid imparts sequence specific activity to the base editor. By way of non-limiting example, the effector protein may comprise a catalytically inactive effector protein. Also, by way of non-limiting example, the base editing enzyme may comprise deaminase activity.

Some base editors modify a nucleobase on a single strand of DNA. In some embodiments, base editors modify a nucleobase on both strands of dsDNA. In some embodiments, upon binding to its target locus in DNA, base pairing between the guide RNA and target DNA strand leads to displacement of a small segment of single-stranded DNA in an “R-loop”. In some embodiments, DNA bases within the R-loop are modified by the deaminase enzyme. In some embodiments, DNA base editors for improved efficiency in eukaryotic cells comprise a catalytically inactive effector protein that may generate a nick in the non-edited DNA strand, inducing repair of the non-edited strand using the edited strand as a template.

Some base editors modify a nucleobase of an RNA. In some embodiments, RNA base editors comprise an adenosine deaminase. In some embodiments, ADAR proteins bind to RNAs and alter their sequence by changing an adenosine into an inosine. In some embodiments, RNA base editors comprise a Cas effector protein that is activated by or binds RNA. Non-limiting examples of Cas effector proteins that are activated by or bind RNA are Cas13 proteins.

In some embodiments, base editors are used to treat a subject having or a subject suspected of having a disease related to a gene of interest. In some embodiments, base editors are useful for treating a disease or a disorder caused by a point mutation in a gene of interest. In some embodiments, compositions comprise a base editor and a guide nucleic acid, wherein the guide nucleic acid directs the base editor to a sequence in a target gene. The target gene may be associated with a disease. In some embodiments, the guide nucleic acid directs the base editor to or near a mutation in the sequence of a target gene. The mutation may be the deletion of one or more nucleotides. The mutation may be the addition of one or more nucleotides. The mutation may be the substitution of one or more nucleotides. The mutation may be the insertion, deletion or substitution of a single nucleotide, also referred to as a point mutation. The point mutation may be a SNP. The mutation may be associated with a disease. In some embodiments, the guide nucleic acid directs the base editor to bind a target sequence within the target nucleic acid that is within 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides of the mutation. In some embodiments, the guide nucleic acid comprises a sequence that is identical, complementary or reverse complementary to a target sequence of a target nucleic acid that comprises the mutation. In some embodiments, the guide nucleic acid comprises a sequence that is identical, complementary or reverse complementary to a target sequence of a target nucleic acid that is within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides of the mutation.
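By way of illustration only, the proximity criterion described above (a target sequence within 0 to 20 nucleotides of a mutation) can be sketched as a simple coordinate check. The function names, the 0-based half-open coordinate convention, and the choice to measure distance from the nearest base of the target sequence are assumptions made for this sketch and are not defined by the disclosure.

```python
def distance_to_target(target_start: int, target_end: int, mutation_pos: int) -> int:
    """Nucleotides between a mutation and the nearest base of a target sequence.

    Assumed convention: 0-based coordinates, target spans [target_start, target_end).
    Returns 0 when the mutation lies inside the target sequence.
    """
    if target_end <= target_start:
        raise ValueError("target must span at least one nucleotide")
    if mutation_pos < target_start:
        return target_start - mutation_pos
    if mutation_pos >= target_end:
        return mutation_pos - (target_end - 1)
    return 0


def within_n_nucleotides(target_start: int, target_end: int,
                         mutation_pos: int, max_distance: int = 20) -> bool:
    """True if the target sequence is within max_distance nucleotides of the mutation."""
    return distance_to_target(target_start, target_end, mutation_pos) <= max_distance
```

For example, under these assumptions a 20-nucleotide target sequence spanning positions 100 to 119 is 5 nucleotides from a mutation at position 95, and so satisfies a 20-nucleotide limit.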

In some embodiments, fusion partners comprise a base editing enzyme. In some embodiments, the base editing enzyme modifies the nucleobase of a deoxyribonucleotide. In some embodiments, the base editing enzyme modifies the nucleobase of a ribonucleotide. A base editing enzyme that converts a cytosine to a guanine or thymine may be referred to as a cytosine base editing enzyme. A base editing enzyme that converts an adenine to a guanine may be referred to as an adenine base editing enzyme. In some embodiments, the base editing enzyme comprises a deaminase enzyme. In some embodiments, the deaminase functions as a monomer. In some embodiments, the deaminase functions as a heterodimer with an additional protein. In some embodiments, base editors comprise a DNA glycosylase inhibitor. In some embodiments, base editors comprise a uracil glycosylase inhibitor (UGI) or uracil N-glycosylase (UNG). In some embodiments, base editors do not comprise a UGI. In some embodiments, base editors do not comprise a UNG. In some embodiments, base editors do not comprise a functional fragment of a UGI. A functional fragment of a UGI is a fragment of a UGI that is capable of excising a uracil residue from DNA by cleaving an N-glycosidic bond. In some embodiments, a functional fragment comprises a fragment of a protein that retains some function relative to the entire protein.

In some embodiments, a base editing enzyme comprises a protein, polypeptide or fragment thereof that is capable of catalyzing the chemical modification of a nucleobase of a deoxyribonucleotide or a ribonucleotide. Such a base editing enzyme, for example, is capable of catalyzing a reaction that modifies a nucleobase that is present in a nucleic acid molecule, such as DNA or RNA (single stranded or double stranded). Non-limiting examples of the type of modification that a base editing enzyme is capable of catalyzing include converting an existing nucleobase to a different nucleobase, such as converting a cytosine to a guanine or thymine or converting an adenine to a guanine, hydrolytic deamination of an adenine or adenosine, or methylation of cytosine (e.g., CpG, CpA, CpT or CpC). A base editing enzyme itself may or may not bind to the nucleic acid molecule containing the nucleobase.

In some embodiments, the base editor is a cytidine deaminase base editor generated by ancestral sequence reconstruction as described in WO2019226953, which is hereby incorporated by reference in its entirety.

Exemplary deaminase domains are described in WO2018027078 and WO2017070632, each of which is hereby incorporated by reference in its entirety. Additional exemplary deaminase domains are described in Komor et al., Nature, 533, 420-424 (2016); Gaudelli et al., Nature, 551, 464-471 (2017); Komor et al., Science Advances, 3:eaao4774 (2017); and Rees et al., Nat Rev Genet. 2018 December; 19(12):770-788. doi: 10.1038/s41576-018-0059-1, which are hereby incorporated by reference in their entirety.

In some embodiments, the base editor is a cytosine base editor (CBE). In general, a CBE comprises a cytosine base editing enzyme and a catalytically inactive effector protein. In some embodiments, the catalytically inactive effector protein is a catalytically inactive variant of a Cas effector protein described herein. The CBE may convert a cytosine to a thymine. In some embodiments, the base editor is an adenine base editor (ABE). In general, an ABE comprises an adenine base editing enzyme and a catalytically inactive effector protein. In some embodiments, the catalytically inactive effector protein is a catalytically inactive variant of a Cas effector protein described herein. The ABE generally converts an adenine to a guanine. In some embodiments, the base editor is a cytosine to guanine base editor (CGBE). In general, a CGBE converts a cytosine to a guanine.

In some embodiments, the base editor is a CBE. In some embodiments, the cytosine base editing enzyme is a cytosine deaminase. In some embodiments, the cytosine deaminase is an APOBEC1 cytosine deaminase, which accepts ssDNA as a substrate but is incapable of cleaving dsDNA, fused to a catalytically inactive effector protein. In some embodiments, when bound to its cognate DNA, the catalytically inactive effector protein performs local denaturation of the DNA duplex to generate an R-loop in which the DNA strand not paired with the guide RNA exists as a disordered single-stranded bubble. In some embodiments, the ssDNA R-loop generated by the catalytically inactive effector protein enables the CBE to perform efficient and localized cytosine deamination in vitro. In some examples, deamination activity is exhibited in a window of about 4 to about 10 base pairs. In some embodiments, fusion to the catalytically inactive effector protein presents the target site to APOBEC1 in high effective molarity, enabling the CBE to deaminate cytosines located in a variety of different sequence motifs, with differing efficacies. In some embodiments, the CBE is capable of mediating RNA-programmed deamination of target cytosines in vitro. In some embodiments, the CBE is capable of mediating RNA-programmed deamination of target cytosines in vivo. In some embodiments, the cytosine base editing enzyme is a cytosine base editing enzyme described by Koblan et al. (2018) Nature Biotechnology 36:848-846; Komor et al. (2016) Nature 533:420-424; Koblan et al. (2021) “Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning,” Nature Biotechnology; Kurt et al. (2021) Nature Biotechnology 39:41-46; Zhao et al. (2021) Nature Biotechnology 39:35-40; and Chen et al. (2021) Nature Communications 12:1384, all incorporated herein by reference.
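As a non-limiting computational illustration of the editing window described above, the following sketch converts every target base that falls inside a positional window of a protospacer-length sequence. The function name, the 1-based numbering from the first listed nucleotide, the inclusive window of positions 4 to 10, the example sequence, and the simplification that every in-window target base is converted are all assumptions of this sketch; actual editing outcomes depend on the effector protein, the deaminase, and sequence context.

```python
def simulate_base_edit(sequence: str, target: str = "C", product: str = "T",
                       window: tuple = (4, 10)) -> str:
    """Return the sequence with each in-window target base replaced by product.

    Positions are 1-based and the window is inclusive (an assumed convention).
    This idealized model edits every target base in the window.
    """
    first, last = window
    if first < 1 or last < first:
        raise ValueError("window must satisfy 1 <= first <= last")
    edited = []
    for position, base in enumerate(sequence.upper(), start=1):
        if base == target and first <= position <= last:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


# Hypothetical 20-nucleotide sequence used only for illustration.
example = "ACCACCGTACCGTACGATCG"
cbe_outcome = simulate_base_edit(example, target="C", product="T")
abe_outcome = simulate_base_edit(example, target="A", product="G")
```

The same function models an idealized adenine base editor by passing "A" as the target and "G" as the product.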

In some embodiments, CBEs comprise a uracil glycosylase inhibitor (UGI) or uracil N-glycosylase (UNG). In some embodiments, base excision repair (BER) of U•G in DNA is initiated by a UNG, which recognizes the U•G mismatch and cleaves the glycosidic bond between uracil and the deoxyribose backbone of DNA. In some embodiments, BER results in the reversion of the U•G intermediate created by the first CBE back to a C•G base pair. In some embodiments, UNG may be inhibited by fusion of uracil DNA glycosylase inhibitor (UGI), in some embodiments, a small protein from bacteriophage PBS, to the C-terminus of the CBE. In some embodiments, UGI is a DNA mimic that potently inhibits both human and bacterial UNG. In some embodiments, a UGI inhibitor is any protein or polypeptide that inhibits UNG. In some embodiments, the CBE mediates efficient base editing in bacterial cells and moderately efficient editing in mammalian cells, enabling conversion of a C•G base pair to a T•A base pair through a U•G intermediate. In some embodiments, the CBE is modified to increase base editing efficiency while editing more than one strand of DNA.

In some embodiments, the CBE nicks the non-edited DNA strand. In some embodiments, the non-edited DNA strand nicked by the CBE biases cellular repair of the U•G mismatch to favor a U•A outcome, elevating base editing efficiency. In some embodiments, the APOBEC1-nickase-UGI fusion efficiently edits in mammalian cells, while minimizing frequency of non-target indels.

In some embodiments, the cytidine deaminase is selected from APOBEC1, APOBEC2, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, APOBEC3A, BE1 (APOBEC1-XTEN-dCas9), BE2 (APOBEC1-XTEN-dCas9-UGI), BE3 (APOBEC1-XTEN-dCas9(A840H)-UGI), BE3-Gam, saBE3, saBE4-Gam, BE4, BE4-Gam, saBE4, or saBE4-Gam as described in WO2021163587, WO202108746, WO2021062227, and WO2020123887, which are incorporated herein by reference in their entirety.

In some embodiments, the fusion protein further comprises a non-protein uracil-DNA glycosylase inhibitor (npUGI). In some embodiments, the npUGI is selected from a group of small molecule inhibitors of uracil-DNA glycosylase (UDG), or a nucleic acid inhibitor of UDG. In some embodiments, the npUGI is a small molecule derived from uracil. Examples of small molecule non-protein uracil-DNA glycosylase inhibitors, fusion proteins, and Cas-CRISPR systems comprising base editing activity are described in WO202108746, which is incorporated by reference in its entirety.

In some embodiments, the fusion partner is a deaminase, e.g., ADAR1/2, ADAR-2, or AID. In some embodiments, the base editor is an ABE. In some embodiments, the adenine base editing enzyme of the ABE is an adenosine deaminase. In some embodiments, the adenine base editing enzyme is selected from ABE8e, ABE8.20m, APOBEC3A, Anc APOBEC, and BtAPOBEC2. In some embodiments, the ABE base editor is an ABET base editor. In some embodiments, the deaminase or enzyme with deaminase activity is selected from ABE8.1m, ABE8.2m, ABE8.3m, ABE8.4m, ABE8.5m, ABE8.6m, ABE8.7m, ABE8.8m, ABE8.9m, ABE8.10m, ABE8.11m, ABE8.12m, ABE8.13m, ABE8.14m, ABE8.15m, ABE8.16m, ABE8.17m, ABE8.18m, ABE8.19m, ABE8.20m, ABE8.21m, ABE8.22m, ABE8.23m, ABE8.24m, ABE8.1d, ABE8.2d, ABE8.3d, ABE8.4d, ABE8.5d, ABE8.6d, ABE8.7d, ABE8.8d, ABE8.9d, ABE8.10d, ABE8.11d, ABE8.12d, ABE8.13d, ABE8.14d, ABE8.15d, ABE8.16d, ABE8.17d, ABE8.18d, ABE8.19d, ABE8.20d, ABE8.21d, ABE8.22d, ABE8.23d, or ABE8.24d. In some embodiments, the adenine base editing enzyme is ABE8.1d. In some embodiments, the adenosine base editor is ABE9. Exemplary deaminases are described in US20210198330, WO2021041945, WO2021050571A1, and WO2020123887, all of which are incorporated herein by reference in their entirety. Sequences of a selection of these enzymes are provided in TABLE 2. In some embodiments, the adenine base editing enzyme is an adenine base editing enzyme described in Chu et al., (2021) The CRISPR Journal 4:2:169-177, incorporated herein by reference. In some embodiments, the adenine deaminase is an adenine deaminase described by Koblan et al. (2018) Nature Biotechnology 36:848-846, incorporated herein by reference. In some embodiments, the adenine base editing enzyme is an adenine base editing enzyme described by Tran et al. (2020) Nature Communications 11:4871. Additional examples of deaminase domains are also described in WO2018027078 and WO2017070632, which are hereby incorporated by reference in their entirety.

In some embodiments, an ABE converts an A•T base pair to a G•C base pair. In some embodiments, the ABE converts a target A•T base pair to G•C in vivo. In some embodiments, the ABE converts a target A•T base pair to G•C in vitro. In some embodiments, ABEs provided herein reverse spontaneous cytosine deamination, which has been linked to pathogenic point mutations. In some embodiments, ABEs provided herein enable correction of pathogenic SNPs (~47% of disease-associated point mutations). In some embodiments, the adenine comprises an exocyclic amine that has been deaminated (e.g., resulting in altering its base pairing preferences). In some embodiments, deamination of adenosine yields inosine. In some embodiments, inosine exhibits the base-pairing preference of guanine in the context of a polymerase active site, although inosine in the third position of a tRNA anticodon is capable of pairing with A, U, or C in mRNA during translation. In some embodiments, an ABE comprises an engineered adenosine deaminase enzyme capable of acting on ssDNA.

In some embodiments, a base editor comprises an adenosine deaminase variant that differs from a naturally occurring deaminase. Relative to the naturally occurring deaminase, the adenosine deaminase variant may comprise a V82S alteration, a T166R alteration, or a combination thereof. In some embodiments, the adenosine deaminase variant comprises at least one of the following alterations relative to a naturally occurring adenosine deaminase: Y147T, Y147R, Q154S, Y123H, and Q154R., which are incorporated herein by reference in their entirety.

In some embodiments, a base editor comprises a deaminase dimer. In some embodiments, a base editor is a deaminase dimer further comprising a base editing enzyme and an adenine deaminase (e.g., TadA).

In some embodiments, the adenosine deaminase is a TadA monomer (e.g., TadA*7.10, TadA*8 or TadA*9). In some embodiments, the adenosine deaminase is a TadA*8 variant. Such a TadA*8 variant includes TadA*8.1, TadA*8.2, TadA*8.3, TadA*8.4, TadA*8.5, TadA*8.6, TadA*8.7, TadA*8.8, TadA*8.9, TadA*8.10, TadA*8.11, TadA*8.12, TadA*8.13, TadA*8.14, TadA*8.15, TadA*8.16, TadA*8.17, TadA*8.18, TadA*8.19, TadA*8.20, TadA*8.21, TadA*8.22, TadA*8.23, or TadA*8.24 as described in WO2021163587 and WO2021050571, each of which is hereby incorporated by reference in its entirety.

In some embodiments, a base editor is a deaminase dimer comprising a base editing enzyme fused to TadA via a linker. In some embodiments, the linker comprises or consists of at least a portion of the sequence:

In some embodiments, the amino terminus of the fusion partner protein is linked to the carboxy terminus of the effector protein via the linker. In some embodiments, the carboxy terminus of the fusion partner protein is linked to the amino terminus of the effector protein via the linker.

In some embodiments, the base editing enzyme is fused to TadA at the N-terminus. In some embodiments, the base editing enzyme is fused to TadA at the C-terminus. In some embodiments, the base editing enzyme is a deaminase dimer comprising an ABE. In some embodiments, the deaminase dimer comprises an adenosine deaminase. In some embodiments, the deaminase dimer comprises TadA fused to an adenine base editing enzyme selected from ABE8e, ABE8.20m, APOBEC3A, Anc APOBEC, and BtAPOBEC2. In some embodiments, TadA is fused to ABE8e or a variant thereof. In some embodiments, TadA is fused to ABE8e or a variant thereof at the amino-terminus (ABE8e-TadA). In some embodiments, TadA is fused to ABE8e or a variant thereof at the carboxy terminus (ABE8e-TadA).

In some embodiments, the amino terminus of the fusion partner protein is linked to the carboxy terminus of the effector protein via the linker. In some embodiments, the carboxy terminus of the fusion partner protein is linked to the amino terminus of the effector protein via the linker. In some embodiments, a linker can comprise an XTEN10 linker (SEQ ID NO: 711), an XTEN40 linker (SEQ ID NO: 734) or an XTEN80 linker (SEQ ID NO: 735). In some embodiments, a linker can comprise an amino acid sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NOs: 711, 734, or 735.

In some embodiments, fusion partners comprise an amino acid sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to ABE8e (SEQ ID NO: 713). In some embodiments, fusion partners comprise an amino acid sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to ABE8.20m (SEQ ID NO: 714). In some embodiments, fusion partners comprise an amino acid sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to APOBEC3 (SEQ ID NO: 732). In some embodiments, fusion partners comprise an amino acid sequence that is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to AncBE4Max (SEQ ID NO: 733).
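For illustration, a percent identity threshold of the kind recited above may be evaluated from a pairwise alignment. The sketch below assumes the two sequences have already been aligned to equal length with "-" marking gaps, and counts identical residues over all aligned columns; this is one common convention, not a definition taken from the disclosure, and the function names and short example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes pre-aligned, equal-length inputs with '-' as the gap character.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, a six-residue peptide differing from a reference at one position is about 83.3% identical to it, satisfying an "at least 80%" threshold but not an "at least 85%" threshold.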

Modifying Proteins

In some instances, a fusion partner provides enzymatic activity that modifies a protein (e.g., a histone) associated with a target nucleic acid. Such enzymatic activities include, but are not limited to, methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, and demyristoylation activity.

In some instances, the fusion partner has enzymatic activity that modifies a protein associated with a target nucleic acid. The protein may be a histone, an RNA binding protein, or a DNA binding protein. Examples of such protein modification activities include methyltransferase activity such as that provided by a histone methyltransferase (HMT) (e.g., suppressor of variegation 3-9 homolog 1 (SUV39H1, also known as KMT1A), euchromatic histone lysine methyltransferase 2 (G9A, also known as KMT1C and EHMT2), SUV39H2, ESET/SETDB1, SET1A, SET1B, MLL1 to 5, ASH1, SMYD2, NSD1, DOT1L, Pr-SET7/8, SUV4-20H1, EZH2, RIZ1); demethylase activity such as that provided by a histone demethylase (e.g., Lysine Demethylase 1A (KDM1A, also known as LSD1), JHDM2a/b, JMJD2A/JHDM3A, JMJD2B, JMJD2C/GASC1, JMJD2D, JARID1A/RBP2, JARID1B/PLU-1, JARID1C/SMCX, JARID1D/SMCY, UTX, JMJD3); acetyltransferase activity such as that provided by a histone acetyltransferase (e.g., catalytic core/fragment of the human acetyltransferase p300, GCN5, PCAF, CBP, TAF1, TIP60/PLIP, MOZ/MYST3, MORF/MYST4, HBO1/MYST2, HMOF/MYST1, SRC1, ACTR, P160, CLOCK); deacetylase activity such as that provided by a histone deacetylase (e.g., HDAC1, HDAC2, HDAC3, HDAC8, HDAC4, HDAC5, HDAC7, HDAC9, SIRT1, SIRT2, HDAC11); kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, and demyristoylation activity.

In some instances, the fusion partner is a chloroplast transit peptide (CTP), also referred to as a plastid transit peptide. In some instances, this targets the fusion protein to a chloroplast. Chromosomal transgenes from bacterial sources must have a sequence encoding a CTP sequence fused to a sequence encoding an expressed protein if the expressed protein is to be compartmentalized in the plant plastid (e.g., chloroplast). The CTP is removed in a processing step during translocation into the plastid. Accordingly, localization of an exogenous protein to a chloroplast is often accomplished by means of operably linking a polynucleotide sequence encoding a CTP sequence to the 5′ region of a polynucleotide encoding the exogenous protein. In some instances, the CTP is located at the N-terminus of the fusion protein. Processing efficiency may, however, be affected by the amino acid sequence of the CTP and nearby sequences at the amino terminus (NH2 terminus) of the peptide.

In some instances, the fusion partner is an endosomal escape peptide. In some instances, an endosomal escape protein comprises the amino acid sequence GLFXALLXLLXSLWXLLLXA (SEQ ID NO: 200), wherein each X is independently selected from lysine, histidine, and arginine. In some instances, an endosomal escape protein comprises the amino acid sequence GLFHALLHLLHSLWHLLLHA (SEQ ID NO: 201). In some instances, the amino acid sequence of the endosomal escape protein is SEQ ID NO: 200 or SEQ ID NO: 201.
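As a minimal sketch of how the variable positions of SEQ ID NO: 200 may be checked computationally, the template can be translated into a regular expression in which each X accepts lysine (K), histidine (H), or arginine (R). The variable and function names are illustrative assumptions; only the template and the allowed residues come from the text above.

```python
import re

# Template of SEQ ID NO: 200; each X is independently K, H, or R.
TEMPLATE = "GLFXALLXLLXSLWXLLLXA"
ALLOWED_X = "KHR"

# Replace each X with a character class such as [KHR].
PATTERN = re.compile(TEMPLATE.replace("X", "[" + ALLOWED_X + "]"))


def matches_template(peptide: str) -> bool:
    """True if the full peptide is an instance of the SEQ ID NO: 200 template."""
    return PATTERN.fullmatch(peptide) is not None
```

With this pattern, SEQ ID NO: 201 (GLFHALLHLLHSLWHLLLHA) is recognized as an instance of SEQ ID NO: 200 in which every X is histidine.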

Prime Editing

In some embodiments, a fusion protein and/or a fusion partner can comprise a prime editing enzyme. When used herein, a prime editing enzyme can describe a protein, polypeptide, or fragment thereof that is capable of catalyzing the modification (insertion, deletion, or base-to-base conversion) of a target nucleotide or nucleotide sequence in a nucleic acid. A prime editing enzyme capable of catalyzing such a reaction includes a reverse transcriptase. A prime editing enzyme may require a prime editing guide RNA (pegRNA) to catalyze the modification. Such a pegRNA can be capable of identifying the nucleotide or nucleotide sequence in the target nucleic acid to be edited and encoding the new genetic information that replaces the targeted nucleotide or nucleotide sequence in the nucleic acid. A prime editing enzyme may require a prime editing guide RNA (pegRNA) and a single guide RNA to catalyze the modification.

In some embodiments, a prime editing enzyme is a protein, a polypeptide or a fragment thereof that is capable of catalyzing the modification (insertion, deletion, or base-to-base conversion) of a target nucleotide or nucleotide sequence in a nucleic acid. A prime editing enzyme capable of catalyzing such a reaction includes a reverse transcriptase. A prime editing enzyme may require a prime editing guide RNA (pegRNA) to catalyze the modification. Such a pegRNA can be capable of identifying the nucleotide or nucleotide sequence in the target nucleic acid to be edited and encoding the new genetic information that replaces the targeted nucleotide or nucleotide sequence in the nucleic acid. A prime editing enzyme may require a prime editing guide RNA (pegRNA) and a single guide RNA to catalyze the modification. In some embodiments, such a prime editing enzyme is an M-MLV RT enzyme or a mutant thereof. In some embodiments, the M-MLV RT enzyme comprises at least one mutation selected from D200N, L603W, T330P, T306K, and W313F relative to wildtype M-MLV RT enzyme.
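The substitution notation used above (e.g., D200N, meaning aspartate at position 200 replaced by asparagine) can be parsed and applied programmatically. The following is a minimal sketch in which the function names are hypothetical, positions are assumed to be 1-based, and the short sequence in the usage note is a toy placeholder rather than any M-MLV RT sequence.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple:
    """Split a label such as 'D200N' into (wild_type, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError("unrecognized substitution label: " + label)
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, labels: list) -> str:
    """Apply substitutions in order, checking each expected wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError(label + " is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(label + " does not match the residue at that position")
        residues[position - 1] = replacement
    return "".join(residues)
```

For instance, applying the hypothetical labels D2N and W4F to the toy sequence MDLW yields MNLF, while a label whose wild-type residue does not match the sequence raises an error instead of silently editing the wrong residue.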

Recombinases

In some embodiments, the fusion partners comprise a recombinase domain. In some embodiments, the enzymatically inactive protein is fused with a recombinase. In some embodiments, the recombinase is a site-specific recombinase. In some embodiments, the fusion partners comprise a recombinase domain wherein the recombinase is a site-specific recombinase. In some embodiments, described herein is a programmed nuclease comprising reduced nuclease activity or no nuclease activity and fused with a recombinase, wherein the recombinase can be a site-specific recombinase. Such polypeptides can be used for site-directed transgene insertion. Examples of site-specific recombinases include a tyrosine recombinase (e.g., Cre, Flp or lambda integrase), a serine recombinase (e.g., gamma-delta resolvase, Tn3 resolvase, Sin resolvase, Gin invertase, Hin invertase, Tn5044 resolvase, IS607 transposase and integrase), or mutants or variants thereof. In some embodiments, the recombinase is a serine recombinase. Non-limiting examples of serine recombinases include, but are not limited to, gamma-delta resolvase, Tn3 resolvase, Sin resolvase, Gin invertase, Hin invertase, Tn5044 resolvase, IS607 transposase, and IS607 integrase. In some embodiments, the site-specific recombinase is an integrase. Non-limiting examples of integrases include, but are not limited to: Bxb1, wBeta, BL3, phiR4, A118, TG1, MR11, phi370, SPBc, TP901-1, phiRV, FC1, K38, phiBT1, and phiC31. Further discussion and examples of suitable recombinase fusion partners are described in U.S. Pat. No. 10,975,392, which is incorporated herein by reference in its entirety.

In some embodiments, the fusion protein comprises a linker that links the recombinase domain to the Cas-CRISPR domain of the effector protein. In some embodiments, the linker is The-Ser.

Additional Fusion Partners

In some embodiments, the fusion partner is a nuclear localization signal (NLS). In some cases, said NLS may have a sequence of KRPAATKKAGQAKKKKEF (SEQ ID NO: 800). The NLS can be selected to match the cell type of interest; for example, several NLSs are known to be functional in different types of eukaryotic cells, e.g., in mammalian cells. Suitable NLSs include the SV40 large T antigen NLS (PKKKRKV, SEQ ID NO: 712) and the c-Myc NLS (PAAKRVKLD, SEQ ID NO: 801). In some embodiments, an NLS may be the SV40 large T antigen NLS or the c-Myc NLS. NLSs that are functional in plant cells are described in Chang et al., (Plant Signal Behav. 2013 October; 8(10):e25976). In some embodiments, an NLS sequence can be selected from the following consensus sequences: KR(K/R)R (SEQ ID NO: 802); K(K/R)RK (SEQ ID NO: 803); (P/R)XXKR(^DE)(K/R) (SEQ ID NO: 804); KRX(W/F/Y)XXAF (SEQ ID NO: 805); (R/P)XXKR(K/R)(^DE) (SEQ ID NO: 806); LGKR(K/R)(W/F/Y) (SEQ ID NO: 807); KRX10-12K(KR)(KR) (SEQ ID NO: 808); or KRX10-12K(KR)X(K/R) (SEQ ID NO: 809). In some cases, (^DE) means any amino acid besides Asp or Glu. In some cases, X10-12 means 10, 11, or 12 residues of X (any amino acid). In some cases, a “/” means either residue 1 or residue 2; for example, (K/R) means residue K or R. In some cases, the NLS is linked to an effector protein by an amide bond, also referred to as a peptide bond, or by one or more amino acids.

In some embodiments, the nucleoplasmin NLS (KRPAATKKAGQAKKKKEF (SEQ ID NO: 800)) is linked or fused to the C-terminus of the effector protein. In some embodiments, the SV40 NLS (PKKKRKVGIHGVPAA) (SEQ ID NO: 810) is linked or fused to the N-terminus of the effector protein. In preferred embodiments, the nucleoplasmin NLS (SEQ ID NO: 800) is linked or fused to the C-terminus of the effector protein and the SV40 NLS (SEQ ID NO: 810) is linked or fused to the N-terminus of the effector protein.
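The NLS consensus sequences of SEQ ID NOs: 802 to 809 can be expressed as regular expressions for screening candidate sequences. The sketch below is illustrative only: it assumes that X denotes any of the uppercase one-letter amino acid codes, that (KR) in SEQ ID NOs: 808 and 809 has the same meaning as (K/R), and that a consensus may match anywhere within a longer sequence. The dictionary and function names are hypothetical.

```python
import re

# Consensus sequences keyed by SEQ ID NO, rewritten as regular expressions.
# (K/R) -> [KR]; (^DE) -> [^DE]; X -> [A-Z]; X10-12 -> [A-Z]{10,12}.
NLS_CONSENSUS = {
    802: r"KR[KR]R",
    803: r"K[KR]RK",
    804: r"[PR][A-Z]{2}KR[^DE][KR]",
    805: r"KR[A-Z][WFY][A-Z]{2}AF",
    806: r"[RP][A-Z]{2}KR[KR][^DE]",
    807: r"LGKR[KR][WFY]",
    808: r"KR[A-Z]{10,12}K[KR][KR]",
    809: r"KR[A-Z]{10,12}K[KR][A-Z][KR]",
}


def matching_consensus(sequence: str) -> list:
    """Return the SEQ ID NOs whose consensus occurs anywhere in the sequence."""
    return sorted(seq_id for seq_id, pattern in NLS_CONSENSUS.items()
                  if re.search(pattern, sequence))
```

Under these assumptions, the nucleoplasmin NLS of SEQ ID NO: 800 matches the bipartite consensus of SEQ ID NO: 808, the SV40 large T antigen NLS (PKKKRKV) matches SEQ ID NO: 803, and the c-Myc NLS (PAAKRVKLD) matches SEQ ID NO: 804.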

Further suitable fusion partners include, but are not limited to, proteins (or fragments/domains thereof) that are boundary elements (e.g., CTCF), proteins and fragments thereof that provide periphery recruitment (e.g., Lamin A, Lamin B, etc.), and protein docking elements (e.g., FKBP/FRB, Pil1/Aby1, etc.).

Linkers for Fusion Partners

In general, effector proteins and fusion partners of a fusion effector protein are connected via a linker. The linker may comprise or consist of a covalent bond. The linker may comprise or consist of a chemical group. In some embodiments, the linker comprises an amino acid. In some cases, a linker comprises a bond or molecule that links a first polypeptide to a second polypeptide. In some instances, a peptide linker comprises at least two amino acids linked by an amide bond. In general, the linker connects a terminus of the effector protein to a terminus of the fusion partner. In some embodiments, the carboxy terminus of the effector protein is linked to the amino terminus of the fusion partner. In some embodiments, the carboxy terminus of the fusion partner is linked to the amino terminus of the effector protein.

In some instances, a terminus of the D2S effector protein is linked to a terminus of the fusion partner through an amide bond. In some instances, a D2S effector protein is coupled to a fusion partner via a linker protein. In some embodiments, a linker comprises a bond or molecule that links a first polypeptide to a second polypeptide. A peptide linker comprises at least two amino acids linked by an amide bond. The linker protein may have any of a variety of amino acid sequences. A linker protein may comprise a region of rigidity (e.g., beta sheet, alpha helix), a region of flexibility, or any combination thereof. In some instances, the linker comprises small amino acids, such as glycine and alanine, that impart high degrees of flexibility. The ordinarily skilled artisan will recognize that design of a peptide conjugated to any desired element may include linkers that are all or partially flexible, such that the linker may include a flexible linker as well as one or more portions that confer less flexible structure. Suitable linkers include proteins of 4 linked amino acids to 40 linked amino acids in length, or between 4 linked amino acids and 25 linked amino acids in length. In some embodiments, when linked amino acids are described herein, it can refer to at least two amino acids linked by an amide bond.

These linkers may be produced by using synthetic, linker-encoding oligonucleotides to couple the proteins, or may be encoded by a nucleic acid sequence encoding a fusion protein (e.g., an effector protein coupled to a fusion partner). Examples of linker proteins include glycine polymers (G)n, glycine-serine polymers (including, for example, (GS)n, (GSGGS)n, (GGSGGS)n, and (GGGS)n, where n is an integer of at least one), glycine-alanine polymers, and alanine-serine polymers. Exemplary linkers may comprise amino acid sequences including, but not limited to, GS (SEQ ID NO: 169), GSGGS (SEQ ID NO: 170), GGSGGS (SEQ ID NO: 171), GGGS (SEQ ID NO: 172), GGSG (SEQ ID NO: 173), GGSGG (SEQ ID NO: 174), GSGSG (SEQ ID NO: 175), GSGGG (SEQ ID NO: 176), GGGSG (SEQ ID NO: 177), and GSSSG (SEQ ID NO: 178).
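By way of non-limiting illustration, a repeat-unit linker of the polymer type described above (a unit repeated n times, where n is an integer of at least one) may be written out as follows. This is a minimal sketch; the function names `build_linker` and `within_length_range` are hypothetical, and the default bounds of 4 and 40 residues reflect only one of the length ranges described herein rather than a requirement.

```python
def build_linker(unit: str, n: int) -> str:
    """Return the repeat unit written out n times, e.g. (GGGS)n."""
    if n < 1:
        raise ValueError("n must be an integer of at least one")
    return unit * n


def within_length_range(linker: str, low: int = 4, high: int = 40) -> bool:
    """Check whether a linker falls within an illustrative length range."""
    return low <= len(linker) <= high


if __name__ == "__main__":
    linker = build_linker("GGGS", 3)
    print(linker, len(linker), within_length_range(linker))
```

The repeat count n can be varied to trade flexibility against the distance between the effector protein and the fusion partner; the appropriate value for a given fusion would be chosen empirically.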

In some embodiments, an effector protein described herein is purified. For example, a D2S effector protein is purified for ex vivo ribonucleoprotein editing. In some instances, an effector protein is purified with a TEV-cleavable maltose binding protein (MBP) tag. In some instances, an effector protein comprises a His tag, a FLAG tag, a GFP tag, or a combination of tags. For example, an effector protein of SEQ ID NOs: 1-45, 202-293, or 728-731 can comprise a component (e.g., tag) disclosed in Table 37. In some instances, an effector protein comprises a T2A tag. In some cases, TEV cleavage occurs before the effector protein is introduced into a cell. After TEV cleavage, an effector protein's N terminus retains three additional amino acids (SerAsnAla; SNA); this also occurs when nuclear localization signals are added to the effector protein. In some cases, an effector protein purified with a TEV-cleavable maltose binding protein (MBP) tag is delivered to a cell with a lipid nanoparticle (LNP). In some cases, a TEV cleaved version of an effector protein is used for ex vivo purposes. In some cases, a TEV cleaved version of an effector protein is used for in vivo purposes.

In some embodiments, a guide RNA for editing a target nucleic acid comprises a sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 715-727.

Nuclease-Dead D2S Effector Proteins

In some instances, the D2S effector protein can comprise an enzymatically inactive (e.g., catalytically inactive) and/or “dead” (abbreviated by “d”) effector protein in combination (e.g., fusion) with a polypeptide comprising recombinase activity. Although a D2S effector protein normally has nuclease activity, in some instances, a D2S effector protein does not have nuclease activity. In some instances, an effector protein comprising at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-45, 202-293, or 728-731 is a nuclease-dead effector protein. In some instances, the effector protein comprising at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with any one of SEQ ID NO: 1-45 and 202-293 is modified or engineered to be a nuclease-dead effector protein. In some instances, an effector protein comprising at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with any one of SEQ ID NOs: 728-731 is a nuclease-dead effector protein.

In some embodiments, catalytic residues of a RuvC domain are a first aspartic acid (D), glutamic acid (E), and a second aspartic acid (D). In some embodiments, the catalytically active residues of CasM.19952 (SEQ ID NO: 23) are D267, E363, and D450. Many amino acid replacements of any catalytic residue can inactivate the nuclease. The most common mutations convert these residues to alanine or to other amino acids that replace the acidic side chain while maintaining structural similarity, e.g., D (aspartate) to N (asparagine), or E (glutamate) to Q (glutamine). In some embodiments, D267A, E363A, D450A, D267N, E363Q, and D450N are all catalytically dead mutants of CasM.19952. In some embodiments, D267A is a catalytically inactive mutant of CasM.286251 (SEQ ID NO: 25).
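By way of non-limiting illustration, substitution notation of the kind used above (wild-type residue, 1-based position, replacement residue, e.g., D267A) may be applied to an amino acid sequence as sketched below. The function name `apply_substitution` is hypothetical, and the short sequence used in the example is a made-up placeholder, not any sequence of the present disclosure. The sketch verifies that the stated wild-type residue is actually present at the stated position before substituting.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution such as 'D267A' to a protein sequence.

    Positions are 1-based. A ValueError is raised if the notation is
    malformed, the position is out of range, or the wild-type residue
    does not match the sequence.
    """
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if m is None:
        raise ValueError(f"malformed substitution: {mutation!r}")
    wild_type, position, replacement = m.group(1), int(m.group(2)), m.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    toy = "MKDLE"  # hypothetical placeholder sequence for illustration only
    print(apply_substitution(toy, "D3A"))
    print(apply_substitution(toy, "E5Q"))
```

Checking the wild-type residue guards against off-by-one numbering errors, which are a common source of mistakes when a position is counted from a tagged or truncated construct rather than from the native sequence.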

A D2S effector protein can comprise a modified form of a wild type counterpart. The modified form of the wild type counterpart can comprise an amino acid change (e.g., deletion, insertion, or substitution) that reduces the nucleic acid-cleaving activity of the effector protein. For example, a nuclease domain (e.g., HEPN domain) of a D2S effector polypeptide can be deleted or mutated so that it is no longer functional or comprises reduced nuclease activity. The modified form of the effector protein can have less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nucleic acid-cleaving activity of the wild-type counterpart. The modified form of an effector protein can have no substantial nucleic acid-cleaving activity. When an effector protein is a modified form that has no substantial nucleic acid-cleaving activity, it can be referred to as enzymatically inactive and/or dead. A dead D2S effector polypeptide can bind to a target nucleic acid sequence but may not cleave the target nucleic acid sequence. A dead D2S effector polypeptide can associate with a guide nucleic acid to activate or repress transcription of a target nucleic acid sequence.

V. Multimeric Complexes

Compositions, systems, and methods of the present disclosure may comprise a multimeric complex or uses thereof, wherein the multimeric complex comprises multiple effector proteins that non-covalently interact with one another. A multimeric complex may comprise enhanced activity relative to the activity of any one of its effector proteins alone. For example, a multimeric complex comprising two D2S effector proteins may comprise greater nucleic acid binding affinity, cis-cleavage activity, and/or transcollateral cleavage activity than that of either of the D2S effector proteins provided in monomeric form. A multimeric complex may have an affinity for a target region of a target nucleic acid and may be capable of catalytic activity (e.g., cleaving, nicking or modifying the nucleic acid) at or near the target region. Multimeric complexes may be activated when complexed with a guide nucleic acid. Multimeric complexes may be activated when complexed with a guide nucleic acid and a target nucleic acid. In some instances, the multimeric complex cleaves the target nucleic acid. In some instances, the multimeric complex nicks the target nucleic acid.

Various aspects of the present disclosure include compositions and methods comprising multiple effector proteins, and uses thereof, respectively. A D2S effector protein comprising at least 70% sequence identity to any one of SEQ ID NO: 1-SEQ ID NO: 45 and SEQ ID NO: 202 to SEQ ID NO: 293 may be provided with a second effector protein. A D2S effector protein comprising at least 70% sequence identity to any one of SEQ ID NO: 728-731 may be provided with a second effector protein. Two effector proteins may target different nucleic acid sequences. Two effector proteins may target different types of nucleic acids (e.g., a first effector protein may target double- and single-stranded nucleic acids, and a second effector protein may only target single-stranded nucleic acids).

In some instances, multimeric complexes comprise at least one D2S effector protein, or a fusion protein thereof, comprising an amino acid sequence with at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% identity to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, multimeric complexes comprise at least one D2S effector protein or a fusion protein thereof, wherein the amino acid sequence of the D2S effector protein is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.

In some instances, the multimeric complex is a dimer comprising two effector proteins of identical amino acid sequences. In some instances, the multimeric complex comprises a first effector protein and a second effector protein, wherein the amino acid sequence of the first effector protein is at least 90%, at least 92%, at least 94%, at least 96%, at least 98%, or at least 99% identical to the amino acid sequence of the second effector protein.

In some instances, the multimeric complex is a heterodimeric complex comprising at least two effector proteins of different amino acid sequences. In some instances, the multimeric complex is a heterodimeric complex comprising a first effector protein and a second effector protein, wherein the amino acid sequence of the first effector protein is less than 90%, less than 85%, less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 15%, or less than 10% identical to the amino acid sequence of the second effector protein.

In some instances, a multimeric complex comprises at least two effector proteins. In some instances, a multimeric complex comprises more than two effector proteins. In some instances, a multimeric complex comprises two, three or four effector proteins. In some instances, at least one effector protein of the multimeric complex comprises an amino acid sequence with at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% identity to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, each effector protein of the multimeric complex comprises an amino acid sequence with at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or 100% identity to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.

VI. Engineered Guide RNAs

The compositions, systems, and methods of the present disclosure may comprise a guide nucleic acid, or a nucleic acid molecule (e.g., DNA molecule) encoding the guide nucleic acid, or a use thereof. When a guide nucleic acid is described herein, it can refer to a nucleic acid comprising: a first nucleotide sequence that hybridizes to a target nucleic acid; and a second nucleotide sequence that is capable of connecting an effector protein to the nucleic acid by either a) hybridizing to a portion of an additional nucleic acid that is bound by an effector protein (e.g., a tracrRNA) or b) being non-covalently bound by an effector protein. The first sequence may be referred to herein as a spacer sequence. In some instances, the second sequence may be referred to herein as a repeat sequence. In some instances, the second sequence may comprise a portion of, or all of, a repeat sequence or a tracrRNA. In some instances, the first sequence is located 5′ of the second nucleotide sequence. In some instances, the first sequence is located 3′ of the second nucleotide sequence.

Provided herein are compositions comprising a D2S effector protein and an engineered guide RNA. In general, a guide nucleic acid is a nucleic acid molecule that binds to an effector protein (e.g., a Cas effector protein), thereby forming a ribonucleoprotein complex (RNP). In some instances, the engineered guide RNA imparts activity or sequence selectivity to the effector protein. In some embodiments, a guide nucleic acid comprises a nucleic acid comprising: a first nucleotide sequence that hybridizes to a target nucleic acid; and a second nucleotide sequence that is capable of being non-covalently bound by an effector protein. The first sequence may be referred to herein as a spacer sequence. The second sequence may be referred to herein as a repeat sequence. In some instances, the first sequence is located 5′ of the second nucleotide sequence. In some instances, the first sequence is located 3′ of the second nucleotide sequence. Guide nucleic acids, when complexed with an effector protein, may bring the effector protein into proximity of a target nucleic acid. Sufficient conditions for hybridization of a guide nucleic acid to a target nucleic acid and/or for binding of a guide nucleic acid to an effector protein include in vivo physiological conditions of a desired cell type or in vitro conditions sufficient for assaying catalytic activity of a protein, polypeptide or peptide described herein, such as the nuclease activity of an effector protein. Guide nucleic acids may comprise DNA, RNA, or a combination thereof (e.g., RNA with a thymine base). Guide nucleic acids may include a chemically modified nucleobase or phosphate backbone. Guide nucleic acids may be referred to herein as a guide RNA (gRNA). However, a guide RNA is not limited to ribonucleotides, but may comprise deoxyribonucleotides and other chemically modified nucleotides.

In general, the engineered guide RNA comprises a CRISPR RNA (crRNA) that is at least partially complementary to a target nucleic acid. In some cases, the nucleotide sequence that hybridizes to a target nucleic acid may be referred to herein as a spacer sequence. In some instances, the engineered guide RNA comprises a trans-activating crRNA (tracrRNA), at least a portion of which interacts with the effector protein. In some embodiments, a trans-activating RNA (tracrRNA) is a nucleic acid that comprises a first sequence that is capable of being non-covalently bound by an effector protein. In some embodiments, tracrRNAs are covalently linked to a crRNA. The tracrRNA may hybridize to a portion of the guide RNA that does not hybridize to the target nucleic acid. In some instances, the crRNA and tracrRNA are provided as a single guide RNA (sgRNA). In some instances, a crRNA and tracrRNA function as two separate, unlinked molecules.

In some embodiments, engineered guide RNAs comprise a crRNA or a portion thereof (e.g., a repeat sequence or a spacer sequence). In some embodiments, the crRNA comprises a first sequence, often referred to herein as a spacer sequence, that hybridizes to a target sequence of a target nucleic acid, and a second sequence that hybridizes to a portion of a tracrRNA, often referred to herein as a repeat sequence. In some embodiments, the repeat sequence is capable of being non-covalently bound by an effector protein. In some embodiments, the crRNA is covalently linked to an additional nucleic acid that interacts with the effector protein. The crRNA may be linked to the additional nucleic acid via an internucleoside linkage (e.g., a phosphodiester bond or phosphorothioate bond). The crRNA may be linked to the additional nucleic acid via one or more linker nucleotides. In some embodiments, the additional nucleic acid comprises a tracrRNA. In some embodiments, the additional nucleic acid comprises an intermediary RNA. In such embodiments, the additional nucleic acid that interacts with the effector protein, for simplicity, can be referred to herein as a tracrRNA or tracrRNA sequence because such an additional nucleic acid can be based on or derived from a tracrRNA, thereby having all or a portion of a tracrRNA sequence. However, it is recognized that in such a context the additional nucleic acid is not a true tracrRNA because it does not act in trans. In some embodiments, a trans-activating RNA (tracrRNA) comprises a nucleic acid that comprises a first sequence that is capable of being non-covalently bound by an effector protein. TracrRNAs may comprise a second sequence that hybridizes to a portion of a crRNA, which may be referred to as a repeat hybridization sequence. In some embodiments, tracrRNAs are covalently linked to a crRNA. A tracrRNA may include deoxyribonucleosides, ribonucleosides, chemically modified nucleosides, or any combination thereof. A tracrRNA may be separate from, but form a complex with, a crRNA and an effector protein. A tracrRNA may include a nucleotide sequence that hybridizes with a portion of a crRNA. A tracrRNA may comprise a secondary structure (e.g., one or more hairpin loops) that facilitates the binding of an effector protein to a guide nucleic acid and/or modification activity of an effector protein on a target nucleic acid. A tracrRNA may include a repeat hybridization region and a hairpin region. The repeat hybridization region may hybridize to all or part of the repeat sequence of a guide nucleic acid. The repeat hybridization region may be positioned 3′ of the hairpin region. The repeat hybridization region may be positioned 5′ of the hairpin region. The hairpin region may include a first sequence, a second sequence that is reverse complementary to the first sequence, and a stem-loop linking the first sequence and the second sequence.

In some instances, the engineered guide RNA comprises a second sequence, at least a portion of which interacts with the effector protein. In some instances, the second sequence may be referred to herein as a repeat sequence. In some instances, the second sequence may be referred to herein as a handle sequence. In some instances, the handle sequence may comprise a portion of, or all of, a repeat sequence.

Guide nucleic acids are often referred to as “guide RNA.” However, a guide nucleic acid may comprise deoxyribonucleotides. The term “guide RNA,” as well as crRNA and tracrRNA, includes guide nucleic acids comprising DNA bases and RNA bases. The term “guide RNA,” which can include crRNA, tracrRNA, second sequence, repeat sequence, handle sequence, or any combination thereof, includes guide nucleic acids comprising DNA bases and RNA bases.

Guide nucleic acids described herein may bind to a D2S effector protein or multimeric complex thereof, wherein the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NO: 1-45, 202-293, or 728-731.

In general, the crRNA comprises a spacer region that hybridizes to a target sequence of a target nucleic acid, and a repeat region that interacts with the D2S effector protein. The repeat region may also be referred to as a “protein-binding segment.” Typically, the repeat region is adjacent to the spacer region. For example, a guide RNA that interacts with the D2S effector protein comprises a repeat region that is 5′ of the spacer region. The spacer region of the guide RNA may comprise complementarity with (e.g., hybridize to) a target sequence of a target nucleic acid. In some cases, the spacer region is 15-28 linked nucleosides in length. In some cases, the spacer region is 15-26, 15-24, 15-22, 15-20, 15-18, 16-28, 16-26, 16-24, 16-22, 16-20, 16-18, 17-26, 17-24, 17-22, 17-20, 17-18, 18-26, 18-24, or 18-22 linked nucleosides in length. In some cases, the spacer region is 18-24 linked nucleosides in length. In some cases, the spacer region is at least 15 linked nucleosides in length. In some cases, the spacer region is at least 16, 18, 20, or 22 linked nucleosides in length. In some cases, the spacer region comprises at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides. In some cases, the spacer region is at least 17 linked nucleosides in length. In some cases, the spacer region is at least 18 linked nucleosides in length. In some cases, the spacer region is at least 20 linked nucleosides in length. In some cases, the spacer region is at least 80%, at least 85%, at least 90%, at least 95% or 100% complementary to a target sequence of the target nucleic acid. In some cases, the spacer region is 100% complementary to the target sequence of the target nucleic acid. In some cases, the spacer region comprises at least 15 contiguous nucleobases that are complementary to the target nucleic acid.
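By way of non-limiting illustration, the spacer length and percent complementarity described above may be computed as sketched below. The function names `spacer_length_ok` and `percent_complementary` are hypothetical. The sketch assumes that the spacer and the target sequence are both written 5′ to 3′, are of equal length, and pair in antiparallel orientation without gaps; it also treats U as pairing with A, which is an assumption added here so that an RNA spacer can be compared with a DNA target.

```python
# Base pairs counted as complementary (spacer base, target base).
_PAIRS = {
    ("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
    ("A", "U"), ("U", "A"),  # assumption: U pairs with A for RNA strands
}


def spacer_length_ok(spacer: str, low: int = 15, high: int = 28) -> bool:
    """Check that the spacer length lies within an illustrative range."""
    return low <= len(spacer) <= high


def percent_complementary(spacer: str, target: str) -> float:
    """Percent of spacer positions that base pair with the target.

    Both sequences are given 5'->3'; the target is read in reverse so
    that the two strands are compared in antiparallel orientation.
    """
    if len(spacer) != len(target):
        raise ValueError("this sketch requires sequences of equal length")
    if not spacer:
        raise ValueError("sequences must not be empty")
    paired = sum((s, t) in _PAIRS for s, t in zip(spacer, reversed(target)))
    return 100.0 * paired / len(spacer)


if __name__ == "__main__":
    print(percent_complementary("GCAU", "ATGC"))
    print(spacer_length_ok("ACGU" * 5))
```

A gapless, equal-length comparison is the simplest case; bulges or partial overlaps between a spacer and a target would require an alignment step that is outside the scope of this sketch.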

In some embodiments, “complementary” and “complementarity,” with reference to a nucleic acid molecule or nucleotide sequence, comprise the characteristic of a polynucleotide having nucleotides that base pair with their Watson-Crick counterparts (C with G; or A with T) in a reference nucleic acid. For example, when every nucleotide in a polynucleotide forms a base pair with a reference nucleic acid, that polynucleotide is said to be 100% complementary to the reference nucleic acid. In a double stranded DNA or RNA sequence, the upper (sense) strand sequence is, in general, understood as going in the direction from its 5′- to 3′-end, and the complementary sequence is thus understood as the sequence of the lower (antisense) strand in the same direction as the upper strand. Following the same logic, the reverse sequence is understood as the sequence of the upper strand in the direction from its 3′- to its 5′-end, while the ‘reverse complement’ sequence or the ‘reverse complementary’ sequence is understood as the sequence of the lower strand in the direction of its 5′- to its 3′-end. Each nucleotide in a double stranded DNA or RNA molecule that is paired with its Watson-Crick counterpart is called its complementary nucleotide.
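By way of non-limiting illustration, the complementary, reverse, and reverse complement sequences as defined above may be derived from an upper-strand DNA sequence written 5′ to 3′ as sketched below. The function names are hypothetical, and the sketch handles only the four unmodified DNA bases A, C, G, and T.

```python
# Watson-Crick pairing for DNA: A with T, C with G.
_PAIRING = str.maketrans("ACGT", "TGCA")


def complement(upper: str) -> str:
    """Lower strand read in the same direction as the upper strand."""
    return upper.translate(_PAIRING)


def reverse(upper: str) -> str:
    """Upper strand read from its 3' end to its 5' end."""
    return upper[::-1]


def reverse_complement(upper: str) -> str:
    """Lower strand read from its own 5' end to its 3' end."""
    return complement(upper)[::-1]


if __name__ == "__main__":
    seq = "ATGC"
    print(complement(seq), reverse(seq), reverse_complement(seq))
```

For the upper strand ATGC, the three derived sequences are TACG, CGTA, and GCAT, respectively, which follows directly from the definitions given above.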

In some instances, the guide RNA does not comprise a tracrRNA. In some cases, a D2S effector protein does not require a tracrRNA to locate and/or cleave a target nucleic acid. In some instances, the crRNA of the guide nucleic acid comprises a repeat region and a spacer region, wherein the repeat region binds to the D2S effector protein and the spacer region hybridizes to a target sequence of the target nucleic acid. The repeat sequence of the crRNA may interact with a D2S effector protein, allowing for the guide nucleic acid and the D2S effector protein to form an RNP complex. In some instances, the guide nucleic acid comprises a crRNA comprising a spacer region, and a repeat region or handle region, wherein at least a portion of the repeat or handle region binds to the D2S effector protein and the spacer region hybridizes to a target sequence of the target nucleic acid. The repeat sequence of the nucleic acid may interact with a D2S effector protein, allowing for the guide nucleic acid and the D2S effector protein to form an RNP complex.

In some cases, a D2S effector protein or a multimeric complex thereof cleaves a precursor RNA (“pre-crRNA”) to produce a guide RNA, also referred to as a “mature guide RNA.” A D2S effector protein that cleaves pre-crRNA to produce a mature guide RNA is said to have pre-crRNA processing activity. In some cases, a repeat region of a guide RNA comprises mutations or truncations relative to respective regions in a corresponding pre-crRNA.

In some embodiments, the term “region” as used herein may be used to describe a portion of or all of a corresponding sequence; for example, a spacer region is understood to comprise a portion of or all of a spacer sequence.

The guide RNA may bind to a target nucleic acid (e.g., a single strand of a target nucleic acid) or a portion thereof. The guide nucleic acid may bind to a target nucleic acid such as a nucleic acid from a bacterium, a virus, a parasite, a protozoan, a fungus or other agents responsible for a disease, or an amplicon thereof. The target nucleic acid may comprise a mutation, such as a single nucleotide polymorphism (SNP). A mutation may confer, for example, resistance to a treatment, such as antibiotic treatment. The guide nucleic acid may bind to a target nucleic acid, such as DNA or RNA, from a cancer gene or gene associated with a genetic disorder, or an amplicon thereof, as described herein. The guide nucleic acid may comprise a first region complementary to a target nucleic acid (FR1) and a second region that is not complementary to the target nucleic acid (FR2). In some cases, FR1 is located 5′ to FR2 (FR1-FR2). In some cases, FR2 is located 5′ to FR1 (FR2-FR1).

In some cases, the guide comprises 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 linked nucleosides. In general, a guide nucleic acid comprises at least linked nucleosides. In some instances, a guide nucleic acid comprises at least 25 linked nucleosides. A guide nucleic acid may comprise 10 to 50 linked nucleosides. In some cases, the guide nucleic acid comprises or consists essentially of about 12 to about 80 linked nucleosides, about 12 to about 50, about 12 to about 45, about 12 to about 40, about 12 to about 35, about 12 to about 30, about 12 to about 25, from about 12 to about 20, about 12 to about 19, about 19 to about 20, about 19 to about 25, about 19 to about 30, about 19 to about 35, about 19 to about 40, about 19 to about 45, about 19 to about 50, about 19 to about 60, about 20 to about 25, about 20 to about 30, about 20 to about 35, about 20 to about 40, about 20 to about 45, about 20 to about 50, or about 20 to about 60 linked nucleosides. In some cases, the guide nucleic acid has about 10 to about 60, about 20 to about 50, or about 30 to about 40 linked nucleosides.

The terms “nucleotide” and “nucleoside” when used in the context of a nucleic acid molecule having multiple residues are used interchangeably and mean the sugar and base of the residue contained in the nucleic acid molecule. The term “nucleobase” when used in the context of a nucleic acid molecule can refer to the base of the residue contained in the nucleic acid molecule, for example, the base of a nucleotide or a nucleoside.

In some embodiments, the guide nucleic acid comprises a nucleotide sequence as described herein (e.g., TABLE 2). Such nucleotide sequences described herein (e.g., TABLE 2) may be described as a nucleotide sequence of either DNA or RNA; however, no matter the form in which the sequence is described, it is readily understood that such nucleotide sequences can be revised to be RNA or DNA, as needed, for describing a sequence within a guide nucleic acid itself or the sequence that encodes a guide nucleic acid, such as a nucleotide sequence described herein for a vector. Similarly, disclosure of the nucleotide sequences described herein (e.g., TABLE 2) also discloses the complementary nucleotide sequence, the reverse nucleotide sequence, and the reverse complement nucleotide sequence, any one of which can be a nucleotide sequence for use in a guide nucleic acid as described herein.
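By way of non-limiting illustration, revising a nucleotide sequence between its DNA and RNA forms amounts to exchanging thymine (T) and uracil (U), as sketched below. The function names `to_rna` and `to_dna` are hypothetical, and the sketch does not account for chemically modified nucleobases.

```python
def to_rna(sequence: str) -> str:
    """Write a nucleotide sequence in RNA form (T replaced by U)."""
    return sequence.upper().replace("T", "U")


def to_dna(sequence: str) -> str:
    """Write a nucleotide sequence in DNA form (U replaced by T)."""
    return sequence.upper().replace("U", "T")


if __name__ == "__main__":
    print(to_rna("GATTACA"))
    print(to_dna("GAUUACA"))
```

The two conversions are inverses of one another for sequences that contain only one of T or U, so a sequence listed in either form identifies the other unambiguously.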

TABLE 2 provides exemplary compositions comprising D2S effector proteins, crRNAs, and tracrRNAs. Each row in TABLE 2 represents an exemplary composition. In some instances, the crRNA comprises a nucleobase sequence of any one of SEQ ID NOs: 46-90 as shown in TABLE 2. In some instances, the nucleobase sequence of the crRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NO: 46-SEQ ID NO: 90. In some instances, the tracrRNA comprises a nucleobase sequence of any one of SEQ ID NOs: 91-148 as shown in TABLE 2. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NO: 91-SEQ ID NO: 148.

TABLE 2
Exemplary Compositions of D2S Effector Protein, crRNA and tracrRNA

Comp. No.   Protein                         crRNA            tracrRNA
1           CasM.298706 (SEQ ID NO: 1)      SEQ ID NO: 46    SEQ ID NO: 91
2           CasM.280604 (SEQ ID NO: 2)      SEQ ID NO: 47    SEQ ID NO: 92
3           CasM.281060 (SEQ ID NO: 3)      SEQ ID NO: 48    SEQ ID NO: 93
4           CasM.284933 (SEQ ID NO: 4)      SEQ ID NO: 49    SEQ ID NO: 94
5           CasM.287908 (SEQ ID NO: 5)      SEQ ID NO: 50    SEQ ID NO: 95
6           CasM.288518 (SEQ ID NO: 6)      SEQ ID NO: 51    SEQ ID NO: 96
7           CasM.293891 (SEQ ID NO: 7)      SEQ ID NO: 52    SEQ ID NO: 97
8           CasM.294270 (SEQ ID NO: 8)      SEQ ID NO: 53    SEQ ID NO: 98
9           CasM.294491 (SEQ ID NO: 9)      SEQ ID NO: 54    SEQ ID NO: 99
10          CasM.295047 (SEQ ID NO: 10)     SEQ ID NO: 55    SEQ ID NO: 100
11          CasM.299588 (SEQ ID NO: 11)     SEQ ID NO: 56    SEQ ID NO: 101
12          CasM.277328 (SEQ ID NO: 12)     SEQ ID NO: 57    SEQ ID NO: 102
13          CasM.297894 (SEQ ID NO: 13)     SEQ ID NO: 58    SEQ ID NO: 103
14          CasM.291449 (SEQ ID NO: 14)     SEQ ID NO: 59    SEQ ID NO: 104
15          CasM.291449 (SEQ ID NO: 14)     SEQ ID NO: 59    SEQ ID NO: 105
16          CasM.297599 (SEQ ID NO: 15)     SEQ ID NO: 60    SEQ ID NO: 106
17          CasM.297599 (SEQ ID NO: 15)     SEQ ID NO: 60    SEQ ID NO: 107
18          CasM.286588 (SEQ ID NO: 16)     SEQ ID NO: 61    SEQ ID NO: 108
19          CasM.286588 (SEQ ID NO: 16)     SEQ ID NO: 61    SEQ ID NO: 109
20          CasM.286910 (SEQ ID NO: 17)     SEQ ID NO: 62    SEQ ID NO: 110
21          CasM.286910 (SEQ ID NO: 17)     SEQ ID NO: 62    SEQ ID NO: 111
22          CasM.292335 (SEQ ID NO: 18)     SEQ ID NO: 63    SEQ ID NO: 112
23          CasM.292335 (SEQ ID NO: 18)     SEQ ID NO: 63    SEQ ID NO: 113
24          CasM.293576 (SEQ ID NO: 19)     SEQ ID NO: 64    SEQ ID NO: 114
25          CasM.293576 (SEQ ID NO: 19)     SEQ ID NO: 64    SEQ ID NO: 115
26          CasM.294537 (SEQ ID NO: 20)     SEQ ID NO: 65    SEQ ID NO: 116
27          CasM.294537 (SEQ ID NO: 20)     SEQ ID NO: 65    SEQ ID NO: 117
28          CasM.298538 (SEQ ID NO: 21)     SEQ ID NO: 66    SEQ ID NO: 118
29          CasM.298538 (SEQ ID NO: 21)     SEQ ID NO: 66    SEQ ID NO: 119
30          CasM.19924 (SEQ ID NO: 22)      SEQ ID NO: 67    SEQ ID NO: 120
32          CasM.19952 (SEQ ID NO: 23)      SEQ ID NO: 68    SEQ ID NO: 120
34          CasM.274559 (SEQ ID NO: 24)     SEQ ID NO: 69    SEQ ID NO: 121
36          CasM.286251 (SEQ ID NO: 25)     SEQ ID NO: 70    SEQ ID NO: 122
38          CasM.288480 (SEQ ID NO: 26)     SEQ ID NO: 71    SEQ ID NO: 120
40          CasM.288668 (SEQ ID NO: 27)     SEQ ID NO: 72    SEQ ID NO: 123
41          CasM.289206 (SEQ ID NO: 28)     SEQ ID NO: 73    SEQ ID NO: 121
43          CasM.290598 (SEQ ID NO: 29)     SEQ ID NO: 74    SEQ ID NO: 121
45          CasM.290816 (SEQ ID NO: 30)     SEQ ID NO: 75    SEQ ID NO: 124
47          CasM.295071 (SEQ ID NO: 31)     SEQ ID NO: 76    SEQ ID NO: 122
49          CasM.295231 (SEQ ID NO: 32)     SEQ ID NO: 77    SEQ ID NO: 124
51          CasM.292139 (SEQ ID NO: 33)     SEQ ID NO: 78    SEQ ID NO: 125
52          CasM.292139 (SEQ ID NO: 33)     SEQ ID NO: 78    SEQ ID NO: 126
54          CasM.279423 (SEQ ID NO: 34)     SEQ ID NO: 79    SEQ ID NO: 127
55          CasM.20054 (SEQ ID NO: 35)      SEQ ID NO: 80    SEQ ID NO: 128
56          CasM.20054 (SEQ ID NO: 35)      SEQ ID NO: 80    SEQ ID NO: 129
57          CasM.282673 (SEQ ID NO: 36)     SEQ ID NO: 81    SEQ ID NO: 130
58          CasM.282673 (SEQ ID NO: 36)     SEQ ID NO: 81    SEQ ID NO: 131
59          CasM.282952 (SEQ ID NO: 37)     SEQ ID NO: 82    SEQ ID NO: 132
60          CasM.282952 (SEQ ID NO: 37)     SEQ ID NO: 82    SEQ ID NO: 133
61          CasM.283262 (SEQ ID NO: 38)     SEQ ID NO: 83    SEQ ID NO: 134
62          CasM.283262 (SEQ ID NO: 38)     SEQ ID NO: 83    SEQ ID NO: 135
63          CasM.284833 (SEQ ID NO: 39)     SEQ ID NO: 84    SEQ ID NO: 136
64          CasM.284833 (SEQ ID NO: 39)     SEQ ID NO: 84    SEQ ID NO: 137
65          CasM.287700 (SEQ ID NO: 40)     SEQ ID NO: 85    SEQ ID NO: 138
66          CasM.291507 (SEQ ID NO: 41)     SEQ ID NO: 86    SEQ ID NO: 139
67          CasM.291507 (SEQ ID NO: 41)     SEQ ID NO: 86    SEQ ID NO: 140
68          CasM.293410 (SEQ ID NO: 42)     SEQ ID NO: 87    SEQ ID NO: 141
69          CasM.293410 (SEQ ID NO: 42)     SEQ ID NO: 87    SEQ ID NO: 142
70          CasM.295105 (SEQ ID NO: 43)     SEQ ID NO: 88    SEQ ID NO: 143
71          CasM.295105 (SEQ ID NO: 43)     SEQ ID NO: 88    SEQ ID NO: 144
72          CasM.295187 (SEQ ID NO: 44)     SEQ ID NO: 89    SEQ ID NO: 145
73          CasM.295187 (SEQ ID NO: 44)     SEQ ID NO: 89    SEQ ID NO: 146
74          CasM.295929 (SEQ ID NO: 45)     SEQ ID NO: 90    SEQ ID NO: 147
75          CasM.295929 (SEQ ID NO: 45)     SEQ ID NO: 90    SEQ ID NO: 148

TABLE 3 provides exemplary compositions comprising D2S effector proteins and sgRNAs. Each row in TABLE 3 represents an exemplary composition. In some instances, the sgRNA comprises a nucleobase sequence of any one of SEQ ID NOs: 22-33 as shown in TABLE 3. In some instances, the nucleobase sequence of the sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-33.

TABLE 3 Exemplary Compositions of D2S Effector Protein and sgRNA
Comp. No. | Effector protein | sgRNA
31 | CasM.19924 (SEQ ID NO: 22) | SEQ ID NO: 149
33 | CasM.19952 (SEQ ID NO: 23) | SEQ ID NO: 149
35 | CasM.274559 (SEQ ID NO: 24) | SEQ ID NO: 150
37 | CasM.286251 (SEQ ID NO: 25) | SEQ ID NO: 151
39 | CasM.288480 (SEQ ID NO: 26) | SEQ ID NO: 149
42 | CasM.289206 (SEQ ID NO: 28) | SEQ ID NO: 150
44 | CasM.290598 (SEQ ID NO: 29) | SEQ ID NO: 150
46 | CasM.290816 (SEQ ID NO: 30) | SEQ ID NO: 152
48 | CasM.295071 (SEQ ID NO: 31) | SEQ ID NO: 151
51 | CasM.295231 (SEQ ID NO: 32) | SEQ ID NO: 152
53 | CasM.292139 (SEQ ID NO: 33) | SEQ ID NO: 153 or RNA sequence: UUAUUAGAAAUGAAAUAUUUUCUAAUGGGGUUGUUGGAAAGAGCUUUUACUGAAAUUUGUAAAGGUGCCCUGAACUUGAGAAUUGAAAAAUUACUCGAGGAAAUGGUACAUCCAACUAUUAAAUACUCGUAUUGCU (SEQ ID NO: 937)

In some instances, a guide nucleic acid can comprise a nucleotide sequence (e.g., a repeat sequence) as shown in TABLE 38. In some instances, a crRNA or a sgRNA comprises a repeat sequence as shown in TABLE 38. In some instances, a guide nucleic acid comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to a sequence in TABLE 38. In some instances, a guide nucleic acid comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to any one of SEQ ID NOs: 630, 641, or 827-929. In some instances, a crRNA or a sgRNA comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to any one of SEQ ID NOs: 630, 641, or 827-929. In some instances, guide nucleic acids comprise at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 25, at least 30, or at least 35 contiguous nucleotides of a nucleotide sequence in TABLE 38.
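The "contiguous nucleotides" language above can be illustrated with a short Python sketch that finds the longest stretch shared by a guide sequence and a reference repeat sequence. This is only one plausible reading, offered as an assumption: the function names `longest_shared_run` and `has_contiguous` are hypothetical, and the comparison is an exact, case-sensitive string match with no alignment gaps. The three sequences are those listed in TABLE 38 for SEQ ID NOs: 630, 859 and 869.

```python
def longest_shared_run(guide: str, reference: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    # prev[j] holds the length of the shared run that ends at the previous
    # guide position and at reference position j - 1.
    prev = [0] * (len(reference) + 1)
    for g in guide:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if g == r:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_contiguous(guide: str, reference: str, minimum: int) -> bool:
    """True if the guide shares at least `minimum` contiguous nucleotides."""
    return longest_shared_run(guide, reference) >= minimum


REPEAT_630 = "UGGUACAUCCAAC"    # SEQ ID NO: 630 (TABLE 38)
REPEAT_859 = "AAUGGUACAUCCAAC"  # SEQ ID NO: 859 (TABLE 38)
REPEAT_869 = "UGGUAUAUCCAAC"    # SEQ ID NO: 869 (TABLE 38)

if __name__ == "__main__":
    print(longest_shared_run(REPEAT_859, REPEAT_630))
    print(longest_shared_run(REPEAT_869, REPEAT_630))
    print(has_contiguous(REPEAT_869, REPEAT_630, 8))
```

Under this reading, SEQ ID NO: 859 contains all 13 nucleotides of SEQ ID NO: 630, whereas SEQ ID NO: 869 differs at one internal position and so shares a longest run of 7, which falls short of an "at least 8" threshold.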

TABLE 38 Examples Of Repeat Sequences Associated With Various Effector Proteins
Associated Effector SEQ ID NO: | Type of Guide Nucleic Acid | Examples of Repeat Sequences | SEQ ID NO
1 | crRNA | CGUUGCAGCUCGCACGUUGGCACUGGUUGAAGG | 827
1 | crRNA | CGUUGCAGCUCGCACGUUGGCACUGGGUUGAAGG | 828
1 | sgRNA | UUGGCACUGGUUGAAGG | 829
1 | sgRNA | CACUGGUUGAAGG | 830
2 | crRNA | GUUGCAACUCACGCGCGUAUGUGGCUUGAAGG | 831
3 | crRNA | GUUGCAAUUCAUAUCUCCGGGUGGAUUGAAGG | 832
4 | sgRNA | AGCGUGUGGCUUGAAGG | 833
4 | sgRNA | UGUGGCUUGAAGG | 834
4, 10 | crRNA | GUUGCAGCGUGCGCGAGCGUGUGGCUUGAAGG | 835
5 | crRNA | GUUGCAACUCGCACGUGAAUGCGACUUGAAGG | 836
5 | sgRNA | UGAAUGCGACUUGAAGG | 837
6 | crRNA | GAUGCAACUCGUGUGUAUGUGCGAGUUGAAGG | 838
7 | crRNA | GACGCAACUCGCGCGCGGGCAUGUAUUGAGGG | 839
8 | crRNA | GAUGCAUCUGACACAGCUGGGUGAGUUGAAGG | 840
8 | sgRNA | GCUGGGUGAGUUGAAGG | 841
9 | crRNA | GUUGCAACACAUGUAUGUGGGUGAGUUGAAGG | 842
11 | crRNA | GUUGCAAUUUGUAUACGAGUGUGACUUGAAGG | 843
12 | crRNA | GCUGCAACACGCGCGGGUACGCGGGUUGAAGG | 844
13 | crRNA | GUUGCAACUCGCACGUUGGCACUGAUUGAAGG | 845
14 | crRNA | GCUGUAGCCCUGCUCAAAUUGUAGGGCGCAUGCAGG | 846
14, 15, 16 | crRNA | GUUGUAGUCGACCUGAAUCUGUGGGGUGCUUACAGG | 847
14, 16, 19 | sgRNA | UGUGGGGUGCUUACAGG | 848
16 | crRNA | GGUGUAUGUAACCGCAAUUUGAAGGGUGCAUACAGG | 849
17, 20 | crRNA | GUUGGAAUCGACCUUAAUUUGAGGUGUGCUUACAGG | 850
18 | crRNA | GCUGAAAGAGCAGAGAAUUUGUUGUGUGCAUACAGG | 851
19 | crRNA | GUUGGAGUCGGCUUGAAUCUGCGGGGUGCUUACAGG | 852
21 | crRNA | GUUGUAAGAGACCCGAAUUUUAGCUGUGUAUACAGG | 853
22 | crRNA | GUUGUGAAUGCAGGCAUUUUUGAUGGUAAAUCCAAC | 854
22, 23, 24, 25, 26, 28, 29, 30, 31, 32, 33, 34, 207, 208, 217, 219, 222, 229, 236, 237, 238 | sgRNA | UGGUACAUCCAAC | 630
23 | crRNA | ACUGUCAGACAAUGCAAAAUGUGUGGUACAUCCAAC | 855
23 | sgRNA | UGGUACAUCC | 856
23 | sgRNA | UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUACAUCCAAC | 857
23 | sgRNA | UGGUACAUCCAACUCUAGGCGCC | 858
23 | sgRNA | AAUGGUACAUCCAAC | 859
23 | sgRNA | UGGUACAUCCAACUCUAGGC | 860
23 | sgRNA | UGGUACAUCCAACUCUAGGCGC | 861
23 | sgRNA | UGGUACAUCCAACUCUAGGCG | 862
23 | sgRNA | UGGUACAUCCAACUCUAGG | 863
23 | sgRNA | AAAUGGUACAUCCAAC | 864
23 | sgRNA | UGGUACAUCCAACUCU | 865
23 | sgRNA | UGGUACAUCCAACUC | 866
23 | sgRNA | UGGUACAUCCAACU | 867
23 | sgRNA | UGGUACAUCCAACUCUAG | 868
23 | sgRNA | UGGUAUAUCCAAC | 869
23 | sgRNA | UGGUACAUCCAACUCUA | 870
23 | sgRNA | AUGGUACAUCCAAC | 871
23 | sgRNA | UGGUACAUCCAA | 872
23 | sgRNA | UGGUACAUCCA | 873
24, 34, 226 | crRNA | GCUGUCAGUAGUAGUAAAAAUGGGGGUACAUCCAAC | 874
25, 31 | crRNA | ACUGUCAGUACAUGCAAAAAUGAGGGUACAUCCAAC | 875
26 | crRNA | ACUGUCAGACAAUGCAAAAUGAGUGGUACAUCCAAC | 876
27 | crRNA | GCUGUUAGAACAUACAAAAUGAAAGGUACAUCCAAC | 877
28 | crRNA | GCUGCAUGUCAUGGCAAAAGGAAAGGUACAUCCAAC | 878
29 | crRNA | GCUGUCAGACACCUAAAAAAUGAGGGUACAUCCAAC | 879
30, 32 | crRNA | GCUGUGAGUCACAGUAAAAAUGAAGGUAUAUCCAAC | 880
33 | crRNA | GAUGUAUAUGCUAUGAUUUUGUAUGGUACAUCCAAC | 881
34, 211, 230 | crRNA | GUUGCAGAACCCGAAUAGACGAAUGAAGGAAUGCAAC | 882
35 | crRNA | GUUGAGCUCUGCAUUACGCAGAUGAAUGACGAG | 883
35, 36, 38, 39, 41, 42, 43, 44, 212 | crRNA | GAUAUAUCUUGUAUGCAUAUGUAGGUUGUGAG | 884
35, 36, 38, 40, 41, 42, 43, 210 | sgRNA | GUUGCAACUUACGCAUAGGUGUAAAAUACGAGG | 885
36 | crRNA | GAUGCAACUUAGAUGCAUAUGUAAGUUGUGAG | 886
36, 37, 38, 41, 42, 43, 45 | crRNA | GUUGCAAUGAACGUAUGUGCAUGAGGUGUGAG | 887
36, 38, 42, 43, 234 | sgRNA | GUUGCAAUUCGUAUGCGCAGGUAAGUUUCGAG | 888
36, 37, 38, 42, 43, 45 | sgRNA | UGUGCAUGAGGUGUGAG | 889
37 | crRNA | GUUGCAAUCUGCGUACAGGCGUAAGAUGUGAG | 890
37 | sgRNA | CAGGCGUAAGAUGUGAG | 891
38, 43 | crRNA | GAUCAUAUCUGCUUGUAUGGGUAUGCUGCGAG | 892
38 | sgRNA | UAUGGGUAUGCUGCGAG | 893
39, 41 | crRNA | GUUGCAACUUACGCAUAGGUGUAAAAUACGAG | 894
40 | crRNA | GAUUAUAUCUGCUUGUAUGGGUAUACUGCGAG | 895
42 | crRNA | UCAGCUCACAACCUACAUAUGCAUACAAGAUAUAUCGU | 896
44 | sgRNA | CAUAUGUAGGUUGUGAG | 897
44 | sgRNA | UGUAGGUUGUGAG | 898
45 | sgRNA | CAUGAGGUGUGAG | 899
202, 205, 213, 233 | sgRNA | AGGUACAUCCAAC | 641
203, 209 | sgRNA | UGCGGUGUAAUUCGAGG | 900
204 | crRNA | GAUGUGAACGACCUUUUUUUGCGGUGUGCUUCGAGG | 901
206 | crRNA | GGUGGAUAUCAUCUUAAAAAGUGAGGUACAUCCAAC | 902
209 | crRNA | GGUGUGAACGACCUUUUUUUGCGGUGUAAUUCGAGG | 903
209 | sgRNA | UUGCGGUGUACUUCGAGG | 904
211 | sgRNA | AGAAGAAGGAUUGGGAC | 905
212 | crRNA | AAUGUGAACGACCUUCUUUUGCGGUGUACUUCGAGG | 906
214 | sgRNA | AAGGUUGAUACAGC | 907
215 | crRNA | GCUGUAAGUCAUGGAAAAAUGGUGAGUACAUCCAAC | 908
215 | sgRNA | AUGGUGAGUACAUCCAAC | 909
216 | sgRNA | GAGCACAUCCAAC | 910
217 | sgRNA | GGGUACAUCCAAC | 911
218 | crRNA | GUUGCGUUUGCCCGUGAUUUCGGGUGUGUAUACAGG | 912
220 | sgRNA | AGGUAUAUCCAAC | 913
221 | crRNA | GGCGUAUGUCUACCUGAAAAAGAAGGUAUAUCCAAC | 914
223 | sgRNA | GGCUACAUACAGC | 915
224 | crRNA | GGUGUAUGUGCACCAUAUAUGUAGGUGACAUACAGC | 916
226, 235 | sgRNA | AAAACAAGGAUUGAAAC | 917
227 | crRNA | GAUGUGAACGACCUUUUUUUGCGGUGUACUUCGAGG | 918
227 | sgRNA | GUGUACUUCGAGG | 919
228 | crRNA | GAUGUAAAUCAUCUAUAAAAGAAAGGUACAUCCAAC | 920
228 | sgRNA | GGUACAUCCAAC | 921
230 | sgRNA | CGUACGUGGAUUGAAAC | 922
231 | crRNA | GCUGCACUGCACCGCCCAUUGAUGGUGUGCUCUAGG | 923
232 | crRNA | AUUGUAGGCGACCUUUUUUUGCGAUGUAGUUCGAGG | 924
232 | sgRNA | AUGUAGUUCGAGG | 925
233 | crRNA | AGUGUAUGAUUACCUGUAGUAUGAGGUACAUCCAAC | 926
239 | sgRNA | GCUGCAAGAGCUCCUAAUUUGAGGGGUGCAUACAGG | 927
240 | crRNA | GAUAGUUUUAACUUCCAUUUGAAAUGUAAAUGCAAC | 928
240 | sgRNA | AUGUAAAUGCAAC | 929

In some instances, a guide nucleic acid can comprise a nucleotide sequence as shown in TABLE 40. In some instances, a sgRNA comprises a repeat sequence as shown in TABLE 40. In some instances, a guide nucleic acid comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to a sequence in TABLE 40. In some instances, a guide nucleic acid comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to any one of SEQ ID NOs: 645, 932, 857, 933, 934, 935, 936, 737, 747, 750, 761, 763, 765, 769, 773, 780, 782, 785 or 941. In some instances, a sgRNA comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to any one of SEQ ID NOs: 645, 932, 857, 933, 934, 935, 936, 737, 747, 750, 761, 763, 765, 769, 773, 780, 782, 785 or 941. In some instances, guide nucleic acids comprise at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 25, at least 30, or at least 35 contiguous nucleotides of a nucleotide sequence in TABLE 40.

TABLE 40 Examples Of sgRNA Sequences SEQ ID sgRNA sequence NOUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAG 645UACCAUUUCUCAGAAAUGGUACAUCCAACUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAG 932UACCAUUUCUCAGAAAUGGUAUAUCCAACUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAG 857 UACCAUUACAUCCAACAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAAUGCACUCGGGAG 933 AAAAACAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAAUGCACUCGGGAA 934 GUACCGAAAAUCCAACAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAAUGCACUCGGGAA 935GUACCUUUUCUCAGAAAAGGUACAUCCAACAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAAUGCACUCGGGAA 936GUACCUUUUCUCAGAAACCAAC AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUU737 UAUUGCACUCGGGAAGUACCUUAUUUCAUUGAGCAACAGAAAGGGUACA UCCAACGGGGCAGUUGGAUGCCCUUAUGCUGAGGGAUUAUUCCACUCGGCAAGUA 747CCAAUAAUAAUGGAUGUGAAAAGGUACAUCCAACCGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAUUCACUCACUAAUACU 750ACAAAUGGAAAAAUUUAAAGGAAAAUGUAAAUGCAACUGAAAUAUUGAUUGAGGUCGCCGUUUACGUUGCGUCACAAGGGCGCGCG 761GGCGACCGAAGGCCGAUCUGUACGGCCUGCAGGUUGAGAAGGCACAUAUUAGAGGAAAAUUGCUUCCCUUUGUGUUCGCUCACCGAGUAUUCCUUGUUAUUUGCGGCAAGAAACUGUCUUAAUUGUUUGAAAGGGUGCAUACAGGAAGCAACCGCGUACACGCGGACGAACGGCCGACCUGCUCGGCCUGAAGGU 763UGAGAAGGUUAUGUAUAAGAGGAGAAAAUCCCCCUUCAUAAUCGCUCACCAAGCUCCCAAUUUACAUAUUUUGAAAGGGCGCAUGCAGGUAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAGGCAACUGAAGGCCGACC 765UGUACGGCCUUAAGGUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUUCCCGUUGUGUUCGCUCACCAAGCACACACGUUUGAAAUGUGGGGUGCUUAC AGGAGUAUGAGGCCGCCGAUAAACGUUUCGCUAGCCUGACAGGCAAUCGCGAA 769CGGGCGGCUGAAGGCCGACCUGUACGGCCUGAAGGAUGAGAAGGCACAUAUAAGUGGAAAAUUGCUUCCCGUUGUGUUCGCUCACCAGGUACUCCUUAAUUUGAAAGCUGCAAGAGCUCCUAAUUUGAGGGGUGCAUACAGGAACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAAUUGCGUAUGCGGCAG 773UUAAGGCCGGCUCGAACGGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCACCAAUACGCGCAAAUU UGAAAAUGUAGUUCGAGGACCGAGGCCGCGAAAAACACAACGCUAGCCGAAAGGCAAUCGCGGGUGCG 780CGGCCGAAGGCCGACUAGAGCGGCCUGAAGGUUGAGAAGCGUGCAUGUAAACGGCAGAAAAAAUGCCUUUUGUACGCGCUCACCGAACACGUCUGAGCG GUUUGAAAGGUGUGCUCUAGGGGGGUUGUUGGAAACCCUUAUGCUGAGGGAUUAUUCCACUCGGUAAGUA 
782CCUUAAAUAGUUAUAGAAAGAUGUAAAUCAUCUAUAAAAGAAAGGUACA UCCAACAAGAUAUGAAUAGGAGUAUUCCUAUGGGGCAGUUGGUUGCCCUUAGCCU 785GAGGUAUUUAAUGCACUCGGGAAGUACUUUCAACAGUAUCCGUUAGAAA AGGUACAUCCAACAUGAAUAGGAUUCGUCCUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGC 941AUUUAUUGCACUCGGGAAGUACCAUUUCUCAGAAAUGGUACAUCCAAC

In some embodiments, a guide nucleic acid can comprise a nucleotide sequence that is shared among the exemplary guide nucleic acids described herein. For example, in some embodiments, a guide nucleic acid comprises a repeat sequence having the nucleotide sequence UGGUACAUCC (SEQ ID NO: 942). In some embodiments, a guide nucleic acid comprises a repeat sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% identical to UGGUACAUCC (SEQ ID NO: 942). Such a repeat sequence includes, for example, the nucleotide sequence of UGGUAUAUCC (SEQ ID NO: 943).
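To make the relationship between SEQ ID NO: 942 and SEQ ID NO: 943 concrete, the following Python sketch computes a simple position-by-position percent identity for two equal-length sequences. This is an assumption made for illustration: identity as used in the disclosure may be determined with a sequence alignment tool and may treat gaps or length differences differently. The function name `percent_identity` and the `THRESHOLDS` tuple are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


SEQ_942 = "UGGUACAUCC"  # SEQ ID NO: 942
SEQ_943 = "UGGUAUAUCC"  # SEQ ID NO: 943

# Identity tiers recited in the paragraph above.
THRESHOLDS = (70, 75, 80, 85, 90, 95)

identity = percent_identity(SEQ_942, SEQ_943)
satisfied = [t for t in THRESHOLDS if identity >= t]

if __name__ == "__main__":
    print(f"identity: {identity:.1f}%")
    print("tiers satisfied:", satisfied)
```

The two sequences differ only at the sixth position (C versus U), so 9 of 10 positions match. Under this ungapped measure, SEQ ID NO: 943 meets every recited tier up to and including "at least 90%" but not "at least 95%".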

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 46; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 91. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 1. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 1. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 47; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 92. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 2. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 2. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 48; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 93. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 3. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 3. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 49; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 94. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 4. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 4. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 50; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 95. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 5. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 5. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 51; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 96. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 6. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 6. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 52; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 97. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 7. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 7. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 53; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 98. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 8. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 8. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 54; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 99. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 9. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 9. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 55; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 100. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 10. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 10. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 56; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 101. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 11. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 11. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 57; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 102. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 12. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 12. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-13; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 58; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 103. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 13. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 13. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 59; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 104. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 14. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 14. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 59; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 105. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 14. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 14. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 106. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 15. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 15. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 60; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 107. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 15. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 15. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 61; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 108. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 16. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 16. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 61; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 109. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 16. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 16. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 62; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 110. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 17. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 17. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 62; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 111. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 17. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 17. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 63; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 112. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 18. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 18. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 63; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 113. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 18. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 18. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 64; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 114. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 19. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 19. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 64; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 115. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 19. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 19. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 65; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 116. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 20. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 20. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 65; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 117. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 20. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 20. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 66; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 118. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 21. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 21. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 14-21; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 66; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 119. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 21. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 21. In some instances, the crRNA and tracrRNA are linked as a sgRNA.
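
By way of non-limiting illustration, the percent identity thresholds recited in these instances may be evaluated computationally. The following is a minimal sketch only, and it rests on assumptions not stated in this passage: the two sequences have already been aligned by some pairwise alignment method, gaps are written as "-", and a column with a gap in one sequence counts as a mismatch. The function names `percent_identity` and `thresholds_met` and the toy sequences are hypothetical and are not sequences of the present disclosure.

```python
from typing import List

# Identity thresholds recited for the effector protein amino acid sequences.
THRESHOLDS = (70, 75, 80, 85, 90, 95, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: a column with a gap in exactly one sequence counts as a
    mismatch; a column with a gap in both sequences is ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> List[int]:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Toy aligned sequences for illustration only.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQR"
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

Other conventions for computing percent identity, for example normalizing by the length of the shorter sequence or excluding gapped columns, would give different values for the same pair, so the convention should be fixed before comparing a sequence against a recited threshold.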

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 67; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 68; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 69; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 70; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 122. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 71; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 120. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 72; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 123. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 27. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 27. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 73; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 74; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 121. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 75; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 124. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 76; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 122. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 77; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 124. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 78; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 125. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 78; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 126. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 79; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 127. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 34. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 34. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 80; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 128. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 35. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 35. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 80; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 129. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 35. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 35. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 81; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 130. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 36. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 36. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 81; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 131. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 36. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 36. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 82; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 132. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 37. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 37. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 82; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 133. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 37. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 37. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 83; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 134. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 38. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 38. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 83; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 135. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 38. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 38. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 84; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 136. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 39. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 39. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 84; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 137. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 39. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 39. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 85; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 138. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 40. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 40. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 86; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 139. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 41. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 41. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 86; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 140. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 41. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 41. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 87; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 141. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 42. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 42. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 87; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 142. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 42. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 42. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 88; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 143. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 43. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 43. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 88; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 144. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 43. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 43. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 89; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 145. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 44. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 44. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 89; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 146. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 44. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 44. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 147. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 45. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 45. In some instances, the crRNA and tracrRNA are linked as a sgRNA.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 35-45; a crRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 90; and a tracrRNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 148. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 45. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 45. In some instances, the crRNA and tracrRNA are linked as a sgRNA.
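
By way of non-limiting illustration, the percent identity thresholds recited above may be evaluated computationally. The following Python sketch assumes a simple global (Needleman-Wunsch) alignment with unit match, mismatch, and gap scores, and assumes that percent identity is the number of identical aligned positions divided by the total number of alignment columns. These scoring values, the denominator convention, and the function names percent_identity and meets_identity_threshold are assumptions made for illustration only; other alignment algorithms and identity conventions exist and may give different values.

```python
def percent_identity(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity from a global alignment of a and b.

    Assumption: identity = identical columns / all alignment columns.
    """
    n, m = len(a), len(b)
    if n == 0 and m == 0:
        raise ValueError("at least one sequence must be non-empty")

    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting columns and matches.
    i, j = n, m
    matches = 0
    columns = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                matches += 1
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * matches / columns


def meets_identity_threshold(a: str, b: str, threshold: float) -> bool:
    """True if a and b are at least `threshold` percent identical."""
    return percent_identity(a, b) >= threshold
```

For example, a candidate crRNA sequence could be compared against a reference sequence with meets_identity_threshold(candidate, reference, 70.0) to test the lowest threshold recited above; the sequences themselves would be taken from the Sequence Listing.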

A guide nucleic acid can comprise RNA, DNA, or a combination thereof. The term “gRNA” refers to a guide nucleic acid comprising RNA. A gRNA may include nucleosides that are not ribonucleic. In some embodiments, all nucleosides in a gRNA are ribonucleic. In some embodiments, some of the nucleosides in a gRNA are not ribonucleic. In embodiments where nucleosides in a gRNA are not ribonucleic, non-ribonucleic nucleosides may be naturally occurring or non-naturally occurring nucleosides. In some embodiments, inter-nucleoside links are phosphodiester bonds. In some embodiments, the inter-nucleoside link between at least two nucleosides in a guide nucleic acid is not a phosphodiester bond. In some embodiments, the inter-nucleoside link between at least two nucleosides is a non-natural inter-nucleoside linkage. Non-natural inter-nucleoside linkages include phosphorous and non-phosphorous inter-nucleoside linkages. Phosphorous inter-nucleoside linkages include phosphorothioate linkages and thiophosphate linkages. An inter-nucleoside linkage may comprise a “C3 spacer”. C3 spacers are known to the skilled person as comprising a chain of three carbon atoms.

Guide nucleic acids may be modified to improve genome editing efficiency, increase stability, reduce off-target effects, and/or increase the affinity of the guide nucleic acid for an effector protein disclosed herein.

Modifications may include non-natural nucleotides and/or non-natural linkages. In addition or alternatively, one or more sugar moieties of the guide nucleic acid may be modified. Such sugar moiety modifications may include 2′-O-methyl (2′OMe), 2′-O-methoxyethyl, and 2′-fluoro. In some embodiments, editing efficiency, or genome editing efficiency, is determined by analyzing the frequency of indel mutations in a nucleic acid or gene knockout. In some embodiments, a flow cytometer or next generation sequencing may be used to analyze cells for indel mutations or gene knockout. In other embodiments, off-target effects may be detected using a flow cytometer, next generation sequencing, or CIRCLE-seq.
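
By way of non-limiting illustration, the frequency of indel mutations may be estimated from next generation sequencing reads that have already been aligned to a reference amplicon. The following Python sketch assumes that each aligned read is summarized by a CIGAR string as used in the SAM alignment format, in which the operations I and D denote an insertion and a deletion relative to the reference. The function names has_indel and indel_frequency, and the choice to count any read carrying at least one insertion or deletion as edited, are assumptions made for illustration; a practical analysis would typically also restrict counting to a window around the expected cleavage site and subtract a background rate from untreated control samples.

```python
import re
from typing import Iterable

# One CIGAR element: a length followed by a SAM operation code.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar: str) -> bool:
    """True if the aligned read contains an insertion (I) or deletion (D)."""
    elements = _CIGAR_ELEMENT.findall(cigar)
    # Reject strings that are empty or contain characters outside the grammar.
    if not elements or "".join(n + op for n, op in elements) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return any(op in "ID" for _, op in elements)


def indel_frequency(cigars: Iterable[str]) -> float:
    """Fraction of aligned reads that carry at least one indel."""
    cigar_list = list(cigars)
    if not cigar_list:
        raise ValueError("no reads supplied")
    edited = sum(1 for cigar in cigar_list if has_indel(cigar))
    return edited / len(cigar_list)
```

With four hypothetical reads, two of which carry an insertion or deletion, indel_frequency(["50M", "20M2D28M", "25M1I24M", "50M"]) evaluates to one half.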

In some preferred embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region comprise a 2′-O-methyl modification and the linkages between the 3 nucleosides at the 3′ end of the spacer region comprise phosphorothioate linkages.

In some embodiments, the first nucleoside at the 5′ end of the repeat region comprises a 2′-O-methyl modification. In some embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications. In some embodiments, the first three nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications. In some embodiments, the last nucleoside at the 3′ end of the spacer region comprises a 2′-O-methyl modification. In some embodiments, the last two nucleosides at the 3′ end of the spacer region comprise 2′-O-methyl modifications. In some embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′-O-methyl modifications.

In some embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region and the 3 nucleosides at the 3′ end of the spacer region comprise a 2′-O-methyl modification, and the linkages between the 3 nucleosides at the 3′ end of the spacer region comprise phosphorothioate linkages.

In some embodiments, the first 3 nucleosides (or one of the first 3 nucleosides, or a combination of the first 3 nucleosides) from the 5′ end of the repeat region and the 3 nucleosides at the 3′ end of the spacer region comprise a 2′-fluoro modification.

In some embodiments, the first nucleoside at the 5′ end of the repeat region comprises a 2′-fluoro modification. In some embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′-fluoro modifications. In some embodiments, the first three nucleosides at the 5′ end of the repeat region comprise 2′-fluoro modifications. In some embodiments, the last nucleoside at the 3′ end of the spacer region comprises a 2′-fluoro modification. In some embodiments, the last two nucleosides at the 3′ end of the spacer region comprise 2′-fluoro modifications. In some embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′-fluoro modifications. In preferred embodiments, the last three nucleosides at the 3′ end of the spacer region comprise 2′-fluoro modifications.

In preferred embodiments, the first two nucleosides at the 5′ end of the repeat region comprise 2′-O-methyl modifications, the first two nucleosides at the 5′ end of the repeat region are linked by a phosphorothioate linkage, and the last three nucleosides at the 3′ end of the spacer region comprise 2′-fluoro modifications.
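
By way of non-limiting illustration, the modification pattern of the preceding paragraph can be written out for a given guide. The following Python sketch assumes a guide consisting of a repeat region followed 5′ to 3′ by a spacer region, and uses an ad hoc notation that is not taken from this disclosure: the prefix "m" marks a 2′-O-methyl nucleoside, the prefix "f" marks a 2′-fluoro nucleoside, and "*" marks a phosphorothioate linkage. The function name annotate_guide and the short sequences in the usage note are hypothetical placeholders.

```python
def annotate_guide(repeat: str, spacer: str) -> str:
    """Annotate a repeat+spacer guide with the pattern described above.

    - first two nucleosides of the repeat region: 2'-O-methyl ("m")
    - linkage between those two nucleosides: phosphorothioate ("*")
    - last three nucleosides of the spacer region: 2'-fluoro ("f")
    """
    if len(repeat) < 2:
        raise ValueError("repeat region needs at least two nucleosides")
    if len(spacer) < 3:
        raise ValueError("spacer region needs at least three nucleosides")

    bases = list(repeat + spacer)
    total = len(bases)
    annotated = []
    for index, base in enumerate(bases):
        if index < 2:
            annotated.append("m" + base)   # 5' end of the repeat region
        elif index >= total - 3:
            annotated.append("f" + base)   # 3' end of the spacer region
        else:
            annotated.append(base)         # unmodified nucleoside
        if index == 0:
            annotated.append("*")          # linkage between nucleosides 1 and 2
    return "".join(annotated)
```

For a hypothetical repeat "GUAC" and spacer "ACGUU", annotate_guide returns "mG*mUACACfGfUfU". The other embodiments described above (for example, three 2′-O-methyl nucleosides, or phosphorothioate linkages at the 3′ end of the spacer region) would follow by changing the index ranges.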

In some embodiments, the linkage between the two nucleosides at the 5′ end of the repeat region comprises a C3 spacer and the linkage between the two nucleosides at the 3′ end of the spacer region comprises a C3 spacer.

In some embodiments, the guide nucleic acid comprises ribonucleic nucleosides and deoxyribonucleic nucleosides. In some embodiments, the guide nucleic acid is a guide RNA wherein the first, eighth, and ninth nucleosides from the 5′ end of the spacer region and the four nucleosides at the 3′ end of the spacer region are deoxyribonucleic nucleosides.

In some embodiments, the guide nucleic acid comprises a polyA tail. In some preferred embodiments, the guide nucleic acid comprises a polyA tail at the 3′ end of the spacer region.

In some embodiments, the engineered guide nucleic acid comprises at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 contiguous nucleotides that are complementary to a eukaryotic sequence. Such a eukaryotic sequence is a sequence of nucleotides that is present in a host eukaryotic cell. Such a sequence of nucleotides is distinguished from nucleotide sequences present in other host cells, such as prokaryotic cells, or viruses. Said sequences present in a eukaryotic cell can be located in a gene, an exon, an intron, a non-coding (e.g., promoter or enhancer) region, a selectable marker, tag, signal, and the like. In some cases, the engineered guide nucleic acid comprises at least 10 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 11 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 12 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 13 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 14 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 15 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 16 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 17 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 18 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 19 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 20 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 21 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 22 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 23 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 24 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 25 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 26 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 27 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 28 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 29 contiguous nucleotides that are complementary to a eukaryotic sequence. In some cases, the engineered guide nucleic acid comprises at least 30 contiguous nucleotides that are complementary to a eukaryotic sequence.
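
By way of non-limiting illustration, whether a guide comprises a given number of contiguous nucleotides complementary to a eukaryotic sequence can be checked computationally. The following Python sketch assumes an RNA guide written 5′ to 3′, a DNA target strand written 5′ to 3′, and strict Watson-Crick pairing only (G-U wobble pairs are not counted). It converts the guide to its reverse complement and finds the longest stretch shared with the target. The function names and the sequences in the usage note are hypothetical.

```python
# Watson-Crick partner, expressed as DNA, for each RNA or DNA base.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement_as_dna(guide: str) -> str:
    """DNA sequence (5'->3') that pairs perfectly with the guide."""
    return "".join(_COMPLEMENT[base] for base in reversed(guide.upper()))


def longest_complementary_run(guide: str, target_dna: str) -> int:
    """Length of the longest contiguous guide stretch complementary to target."""
    probe = reverse_complement_as_dna(guide)
    target = target_dna.upper()
    best = 0
    # Longest-common-substring dynamic programming, one row at a time.
    previous = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        current = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_contiguous_complementarity(guide: str, target_dna: str,
                                   minimum: int = 10) -> bool:
    """True if at least `minimum` contiguous guide nucleotides pair with target."""
    return longest_complementary_run(guide, target_dna) >= minimum
```

For a hypothetical 12-nucleotide guide "GGAUCCAUUGCA" and a target strand containing "TGCAATGGATCC", has_contiguous_complementarity returns True at the default minimum of 10; the minimum argument can be set to any of the values from 10 to 30 recited above.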

Effector Protein-sgRNA Complexes

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 151. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 152. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 151. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 152. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 153. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33.

TABLE 13 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs (crRNA or sgRNA), and tracrRNAs. Each row in TABLE 13 represents an exemplary composition. In some instances, the cr/sgRNA and/or tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 13. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 13. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the tracrRNA sequences present in TABLE 13. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of the PAM sequences in TABLE 13.

TABLE 14 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs, and tracrRNAs. Each row in TABLE 14 represents an exemplary composition. In some instances, the cr/sgRNA and/or tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 14. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 14. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the tracrRNA sequences present in TABLE 14. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from any one of the PAM sequences in TABLE 14.

TABLE 15 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs, and tracrRNAs. Each row in TABLE 15 represents an exemplary composition. In some instances, the cr/sgRNA and/or the tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 15. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 463, 464, and 466. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 465. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 223, 224, or 214.

TABLE 16 provides exemplary compositions comprising D2S effector proteins and cr/sgRNAs. Each row in TABLE 16 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 16. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 180 or 467. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 369 or 370.

TABLE 17 provides exemplary compositions comprising D2S effector proteins and cr/sgRNAs. Each row in TABLE 17 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 17. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 468-481. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 368-371.

TABLE 18 provides exemplary compositions comprising D2S effector proteins and cr/sgRNAs. Each row in TABLE 18 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 18. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 18. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23.

TABLE 19 provides exemplary compositions comprising D2S effector proteins and cr/sgRNAs. Each row in TABLE 19 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 19. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 19. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.

TABLE 20 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs, and tracrRNAs. Each row in TABLE 20 represents an exemplary composition. In some instances, the cr/sgRNA and/or the tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 20. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 20. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the tracrRNA sequences present in TABLE 20. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 304, 312, 313, 315, 324, or 335.

TABLE 21 provides exemplary compositions comprising D2S effector proteins and cr/sgRNAs. Each row in TABLE 21 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 21. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 612-615. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 232, 233, 240, or 227. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 301, 318, 335, 343, 360, or 365.

TABLE 22 provides an exemplary composition comprising a D2S effector protein and a cr/sgRNA. The row in TABLE 22 represents an exemplary composition. In some instances, the cr/sgRNA comprises a nucleobase sequence shown in TABLE 22. In some instances, the nucleobase sequence of the sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 616. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 228. In some embodiments, the composition is used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises the PAM sequence of SEQ ID NO: 368.

TABLE 23 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs, and tracrRNAs. Each row in TABLE 23 represents an exemplary composition. In some instances, the cr/sgRNA and/or the tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 23. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 617, 620, or 621. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 618-619. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 215. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises the PAM sequence of SEQ ID NO: 343.

TABLE 24 provides exemplary compositions comprising D2S effector proteins, cr/sgRNAs, and tracrRNAs. Each row in TABLE 24 represents an exemplary composition. In some instances, the cr/sgRNA and/or the tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 24. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 68 and 149. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 120. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23. In some embodiments, the compositions are used to generate a modification of a target nucleic acid. In some embodiments, the target nucleic acid comprises a PAM sequence selected from SEQ ID NOs: 325-328.

TABLE 25 provides exemplary compositions comprising D2S effector proteins, sgRNAs, linker sequences, repeat sequences, spacer sequences, and tracrRNAs. Each row in TABLE 25 represents an exemplary composition. In some instances, the cr/sgRNA and/or the tracrRNA comprises a nucleobase sequence of any one of the sequences as shown in TABLE 25. In some instances, the linker sequence, the repeat sequence, and/or the spacer sequence comprises a nucleobase sequence of any one of the sequences as shown in TABLE 25. In some instances, the nucleobase sequence of the cr/sgRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the cr/sgRNA sequences present in TABLE 25. In some instances, the nucleobase sequence of the tracrRNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the tracrRNA sequences present in TABLE 25. In some instances, the nucleobase sequence of the linker sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 623. In some instances, the nucleobase sequence of the repeat sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the repeat sequences present in TABLE 25. In some instances, the nucleobase sequence of the spacer sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the spacer sequences present in TABLE 25. In some instances, a D2S effector protein can comprise a sequence at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23.

TABLE 26 provides exemplary spacer sequences. In some instances, the spacer sequence comprises a nucleobase sequence shown in TABLE 26. In some instances, the nucleobase sequence of the spacer sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to the spacer sequence present in TABLE 26.

TABLE 28 provides exemplary spacer sequences. In some instances, the spacer sequence comprises a nucleobase sequence shown in TABLE 28. In some instances, the nucleobase sequence of the spacer sequence is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to the spacer sequence present in TABLE 28.

TABLE 34 provides exemplary compositions comprising D2S effector proteins and sgRNAs with and without spacer sequences. Each row in TABLE 34 represents an exemplary composition. In some instances, the nucleobase sequence of a guide RNA is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the guide RNA (with or without a spacer) sequences present in TABLE 34.
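The percent identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes the two sequences have already been aligned (equal length, with "-" marking gaps) and counts identity over all alignment columns, which is one common convention and is not asserted to be the method used for the sequences of the present disclosure. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are already aligned (same length, '-' for
    gaps) and identity is counted over every alignment column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-nucleotide toy sequences differing at one position.
toy_reference = "ACGUACGUAC"
toy_candidate = "ACGUACGUAA"
identity = percent_identity(toy_reference, toy_candidate)
```

In this toy case nine of ten columns match, so the candidate would satisfy an "at least 90%" threshold but not an "at least 95%" threshold.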

Effector Protein-sgRNA Complexes

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 22.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 23.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 24.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 151. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 25.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 149. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 26.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 28.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 150. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 29.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 152. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 30.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 151. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 31.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 152. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 32.

In some instances, compositions disclosed herein comprise an effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to any one of SEQ ID NOs: 22-34; and a guide RNA comprising a nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 99%, or 100% identical to SEQ ID NO: 153. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 99%, or 100% identical to SEQ ID NO: 33.

Pooling Guide Nucleic Acids

In some instances, compositions, systems, or methods provided herein comprise a pool of guide nucleic acids. In some instances, the guide nucleic acids of the pool are tiled against a target nucleic acid, e.g., a genomic locus of interest. In some instances, a guide nucleic acid is selected from a group of guide nucleic acids that have been tiled against a nucleic acid sequence of a genomic locus of interest. The genomic locus of interest may belong to a viral genome, a bacterial genome, or a mammalian genome. Non-limiting examples of viral genomes are an HPV genome, an HIV genome, an influenza genome, or a coronavirus genome. Often, these guide nucleic acids are pooled for detecting a target nucleic acid in a single assay. Pooling of guide nucleic acids may ensure broad spectrum identification, or broad coverage, of a target species within a single reaction. This may be particularly helpful in diseases or indications, like sepsis, that may be caused by multiple organisms. The pool of guide nucleic acids may enhance the detection of a target nucleic acid using systems or methods described herein relative to detection with a single guide nucleic acid. The pool of guide nucleic acids may ensure broad coverage of the target nucleic acid within a single reaction using the methods described herein. In some instances, the guide nucleic acids of the pool are collectively complementary to at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% of the target nucleic acid. In some instances, at least a portion of the guide nucleic acids of the pool overlap in sequence. In some instances, at least a portion of the guide nucleic acids of the pool do not overlap in sequence. In some cases, the pool of guide nucleic acids comprises at least 2, at least 3, at least 4, at least 5, or at least 6 guide nucleic acids targeting different sequences of a target nucleic acid.
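By way of illustration only, tiling and collective coverage can be sketched as a simple interval calculation. The sketch below assumes fixed-length guide windows placed at a constant step along a target of known length and reports the fraction of target positions covered by at least one window. The function names, window length, and step sizes are hypothetical choices for the example and are not values taken from the present disclosure.

```python
def tile_windows(target_len: int, guide_len: int, step: int) -> list[tuple[int, int]]:
    """Half-open (start, end) windows of length guide_len tiled along a target.

    Assumption: windows start at position 0 and advance by a constant step;
    a step smaller than guide_len yields overlapping guides.
    """
    if guide_len <= 0 or step <= 0:
        raise ValueError("guide_len and step must be positive")
    return [
        (start, start + guide_len)
        for start in range(0, target_len - guide_len + 1, step)
    ]


def coverage_fraction(target_len: int, windows: list[tuple[int, int]]) -> float:
    """Fraction of target positions covered by at least one window."""
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    covered = [False] * target_len
    for start, end in windows:
        for position in range(max(start, 0), min(end, target_len)):
            covered[position] = True
    return sum(covered) / target_len


# Hypothetical 100-nucleotide target and 20-nucleotide guide windows.
overlapping_pool = tile_windows(100, 20, 10)   # adjacent windows overlap by 10
spaced_pool = tile_windows(100, 20, 40)        # windows do not overlap
```

With these toy parameters the overlapping pool covers every position of the target, while the spaced pool of three guides covers 60% of it, which corresponds to the "at least 60%" collective complementarity described above.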

Intermediary Nucleic Acids

A guide nucleic acid may comprise or be coupled to an intermediary nucleic acid. The intermediary nucleic acid may also be referred to as an intermediary RNA, although it may comprise deoxyribonucleosides in addition to ribonucleosides. The intermediary RNA may be separate from, but form a complex with, a crRNA to form a discrete gRNA system. The intermediary RNA may be linked to a crRNA to form a composite gRNA. A D2S effector protein may bind a crRNA and an intermediary RNA. In some cases, the crRNA and the intermediary RNA are provided as a single nucleic acid (e.g., covalently linked). In some instances, the crRNA and the intermediary RNA are separate polynucleotides (e.g., a discrete gRNA system). An intermediary RNA may comprise a repeat hybridization region and a hairpin region. The repeat hybridization region may hybridize to all or part of the sequence of the repeat of a crRNA. The repeat hybridization region may be positioned 3′ of the hairpin region. The hairpin region may comprise a first sequence, a second sequence that is reverse complementary to the first sequence, and a stem-loop linking the first sequence and the second sequence.
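The relationship between the first sequence and the second sequence of a hairpin region can be illustrated with a short check. This is a minimal sketch under stated assumptions: only canonical Watson-Crick RNA pairs (A-U and G-C) are considered, so G-U wobble pairs and modified or deoxyribonucleosides are not handled, and the function names and toy sequences are hypothetical rather than taken from the Sequence Listing.

```python
# Canonical Watson-Crick RNA pairs only (assumption: no G-U wobble pairing).
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'.

    Raises KeyError for any character other than A, C, G, or U.
    """
    return "".join(_RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def is_hairpin(first: str, loop: str, second: str) -> bool:
    """True if `second` is the reverse complement of `first` and a loop is present."""
    return len(loop) > 0 and second.upper() == reverse_complement_rna(first)


# Hypothetical toy hairpin: 4-nucleotide stem arms joined by a 4-nucleotide loop.
toy_first, toy_loop, toy_second = "GGAC", "UUCG", "GUCC"
toy_is_hairpin = is_hairpin(toy_first, toy_loop, toy_second)
```

A stricter or looser definition (for example, tolerating mismatches or wobble pairs in the stem) may be more appropriate for a given intermediary RNA; the exact-match rule here is chosen only to keep the example simple.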

A D2S effector protein ribonucleoprotein (RNP) complex may comprise a D2S effector protein complexed with a guide nucleic acid (e.g., a crRNA) and an intermediary RNA. Sometimes, a guide nucleic acid comprises a crRNA and an intermediary RNA (e.g., the crRNA and intermediary RNA are provided as a single nucleic acid molecule). A composition may comprise a crRNA, an intermediary RNA, a D2S effector protein, and a detector nucleic acid.

In some instances, the length of an intermediary RNA is not greater than 50, 56, 68, 71, 73, 95, or 105 linked nucleosides. In some instances, the length of an intermediary RNA is about 30 to about 120 linked nucleosides. In some instances, the length of an intermediary RNA is about 50 to about 105, about 50 to about 95, about 50 to about 73, about 50 to about 71, about 50 to about 68, or about 50 to about 56 linked nucleosides. In some instances, the length of an intermediary RNA is 56 to 105 linked nucleosides, 68 to 105 linked nucleosides, 71 to 105 linked nucleosides, 73 to 105 linked nucleosides, or 95 to 105 linked nucleosides. In some instances, the length of an intermediary RNA is 40 to 60 nucleotides. In some instances, the length of the intermediary RNA is 50, 56, 68, 71, 73, 95, or 105 linked nucleosides. In some instances, the length of the intermediary RNA is 50 nucleotides.

An exemplary intermediary RNA may comprise, from 5′ to 3′, a 5′ region, a hairpin region, a repeat hybridization region, and a 3′ region. In some cases, the 5′ region may hybridize to the 3′ region. In some instances, the 5′ region does not hybridize to the 3′ region. In some cases, the 3′ region is covalently linked to the crRNA (e.g., through a phosphodiester bond). In some instances, an intermediary RNA may comprise an un-hybridized region at the 3′ end of the intermediary RNA. The un-hybridized region may have a length of about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 12, about 14, about 16, about 18, or about 20 linked nucleosides. In some instances, the length of the un-hybridized region is 0 to 20 linked nucleosides.
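The 5′ to 3′ arrangement of regions described above can be represented as a simple record, shown here as a minimal sketch. The class name, field names, and toy region sequences are hypothetical and are not sequences of the present disclosure. The default bounds of 30 and 120 treat the "about 30 to about 120 linked nucleosides" range as hard limits, which is a simplifying assumption for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntermediaryRNA:
    """Hypothetical container for the four regions, each written 5' to 3'."""

    five_prime: str
    hairpin: str
    repeat_hybridization: str
    three_prime: str

    @property
    def sequence(self) -> str:
        # Regions are concatenated in 5' to 3' order, with the repeat
        # hybridization region positioned 3' of the hairpin region.
        return (
            self.five_prime
            + self.hairpin
            + self.repeat_hybridization
            + self.three_prime
        )

    def length_in_range(self, low: int = 30, high: int = 120) -> bool:
        """True if the total length falls within [low, high] (assumed hard bounds)."""
        return low <= len(self.sequence) <= high


# Toy regions for illustration only (2 + 12 + 20 + 2 = 36 nucleosides).
toy_rna = IntermediaryRNA(
    five_prime="GA",
    hairpin="GGACUUCGGUCC",
    repeat_hybridization="AUUGC" * 4,
    three_prime="UU",
)
```

Keeping the regions as separate fields makes it straightforward to check a candidate against any of the length ranges recited above by passing different bounds to `length_in_range`.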

VII. Vectors and Multiplexed Expression Vectors

In some instances, compositions and systems provided herein comprise a vector system encoding a polypeptide (e.g., an effector protein) described herein. In some instances, compositions and systems provided herein comprise a vector system encoding a guide nucleic acid (e.g., crRNA, tracrRNA, or sgRNA) described herein. In some instances, compositions and systems provided herein comprise a multi-vector system encoding an effector protein and a guide nucleic acid described herein, wherein the guide nucleic acid and the effector protein are encoded by the same or different vectors. In some instances, the engineered guide and the engineered effector protein are encoded by different vectors of the system. In some embodiments, a nucleic acid encoding a polypeptide (e.g., an effector protein) comprises an expression vector. In some embodiments, a nucleic acid encoding a polypeptide is a messenger RNA. In some embodiments, an expression vector comprises or encodes an engineered guide nucleic acid. In some cases, the expression vector encodes the crRNA or sgRNA.

In some instances, a vector may encode one or more engineered effector proteins. In some instances, a vector may encode 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 engineered effector proteins. In some instances, a vector can encode one or more engineered effector proteins comprising an amino acid sequence of any one of SEQ ID NOs: 1-45. In some instances, a vector can encode one or more engineered effector proteins comprising an amino acid sequence of any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, a vector can encode one or more engineered effector proteins comprising an amino acid sequence with at least 75%, 80%, 85%, 90%, 95% or 98% sequence identity to any one of SEQ ID NOs: 1-45. In some instances, a vector can encode one or more engineered effector proteins comprising an amino acid sequence with at least 75%, 80%, 85%, 90%, 95% or 98% sequence identity to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.

In some instances, a vector may encode one or more guide nucleic acids. In some instances, a vector may encode 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 different guide nucleic acids. In some instances, a vector can encode one or more guide nucleic acids comprising a crRNA sequence of any one of SEQ ID NOs: 46-90. In some instances, a vector can encode one or more guide nucleic acids comprising a crRNA sequence with at least 75%, 80%, 85%, 90%, 95% or 98% sequence identity to any one of SEQ ID NOs: 46-90. In some instances, a vector can encode one or more guide nucleic acids comprising a tracrRNA sequence of any one of SEQ ID NOs: 91-148. In some instances, a vector can encode one or more guide nucleic acids comprising a tracrRNA sequence with at least 75%, 80%, 85%, 90%, 95% or 98% sequence identity to any one of SEQ ID NOs: 91-148. In some instances, the tracrRNA and the crRNA may be linked into a single guide RNA. In some instances, a vector can encode one or more guide nucleic acids comprising a nucleobase sequence of any one of SEQ ID NOs: 149-153. In some instances, a vector can encode one or more guide nucleic acids comprising a guide sequence with at least 75%, 80%, 85%, 90%, 95% or 98% sequence identity to any one of SEQ ID NOs: 149-153.
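By way of non-limiting illustration, a percent sequence identity threshold such as those recited above may be evaluated as sketched below. This sketch assumes that the two sequences have already been aligned to equal length by a separate alignment tool, with "-" marking gaps, and that identity is computed as identical positions divided by total alignment columns. Other definitions of percent identity exist, and the function names are hypothetical.

```python
# Illustrative sketch only. Function names and the identity definition
# (matches / alignment columns) are assumptions, not the disclosure's method.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gap characters ("-") never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 75.0
) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In this sketch, two aligned four-residue sequences differing at one position share 75% identity, and so would meet an "at least 75%" threshold but not an "at least 80%" threshold.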

Lipid Particles

In some instances, compositions and systems provided herein comprise a lipid particle. In some embodiments, a lipid particle is a lipid nanoparticle (LNP). In some embodiments, a lipid or a lipid nanoparticle can encapsulate an expression vector. In some embodiments, a lipid or a lipid nanoparticle can encapsulate the D2S effector protein, the sgRNA or crRNA, the nucleic acid encoding the D2S effector protein and/or the DNA molecule encoding the sgRNA or crRNA. LNPs are a non-viral delivery system for gene therapy. LNPs are effective for delivery of nucleic acids. Beneficial properties of LNP include ease of manufacture, low cytotoxicity and immunogenicity, high efficiency of nucleic acid encapsulation and cell transfection, multi-dosing capabilities and flexibility of design (Kulkarni et al., (2018) Nucleic Acid Therapeutics, 28(3):146-157). In some cases, a method can comprise contacting a cell with an expression vector. In some cases, contacting can comprise electroporation, lipofection, or lipid nanoparticle (LNP) delivery of an expression vector.

Viral Vectors

An expression vector can be a viral vector. In some embodiments, a viral vector comprises a nucleic acid to be delivered into a host cell via a recombinantly produced virus or viral particle. The nucleic acid may be single-stranded or double-stranded, linear or circular, segmented or non-segmented. The nucleic acid may comprise DNA, RNA, or a combination thereof. In some embodiments, the expression vector is an adeno-associated viral vector. There are a variety of viral vectors that are associated with various types of viruses, including but not limited to retroviruses (e.g., lentiviruses and γ-retroviruses), adenoviruses, arenaviruses, alphaviruses, adeno-associated viruses (AAVs), baculoviruses, vaccinia viruses, herpes simplex viruses and poxviruses. A viral vector provided herein can be derived from or based on any such virus. Often the viral vector provided herein is an adeno-associated viral vector (AAV vector). Generally, an AAV vector has two inverted terminal repeats (ITRs). Accordingly, in some embodiments, the viral vector provided herein comprises two inverted terminal repeats of AAV. The DNA sequence in between the ITRs of an AAV vector provided herein may be referred to herein as the sequence encoding the genome editing tools. These genome editing tools can include, but are not limited to, an effector protein, effector protein modifications (e.g., nuclear localization signal (NLS), polyA tail), guide nucleic acid(s), respective promoter(s), and a donor nucleic acid, or combinations thereof. In some embodiments, a nuclear localization signal comprises an entity (e.g., peptide) that facilitates localization of a nucleic acid, protein, or small molecule to the nucleus, when present in a cell that contains a nuclear compartment.

In general, viral vectors provided herein comprise at least one promoter or a combination of promoters driving expression or transcription of one or more genome editing tools described herein. In some embodiments, the length of the promoter is less than about 500, less than about 400, or less than about 300 linked nucleotides. In some embodiments, the length of the promoter is at least 100 linked nucleotides. Non-limiting examples of promoters include CMV, EF1a, RPBSA, hPGK, EFS, SV40, PGK1, Ubc, human beta actin promoter, CAG, TRE, UAS, Ac5, Polyhedrin, CaMKIIa, GAL1, H1, TEF1, GDS, ADH1, CaMV35S, Ubi, U6, MNDU3, and MSCV. In some embodiments, the promoter is an inducible promoter that only drives expression of its corresponding gene when a signal is present, e.g., a hormone, a small molecule, or a peptide. Non-limiting examples of inducible promoters are the T7 RNA polymerase promoter, the T3 RNA polymerase promoter, the Isopropyl-beta-D-thiogalactopyranoside (IPTG)-regulated promoter, a lactose induced promoter, a heat shock promoter, a tetracycline-regulated promoter (tetracycline-inducible or tetracycline-repressible), a steroid regulated promoter, a metal-regulated promoter, and an estrogen receptor-regulated promoter. In some embodiments, the promoter is an activation-inducible promoter, such as a CD69 promoter, as described further in Kulemzin et al., (2019), BMC Med Genomics, 12:44.

In some embodiments, the coding region of the AAV vector forms an intramolecular double-stranded DNA template thereby generating an AAV vector that is a self-complementary AAV (scAAV) vector. In general, the sequence encoding the genome editing tools of an scAAV vector has a length of about 2 kb to about 3 kb. The scAAV vector can comprise nucleotide sequences encoding an effector protein, providing guide nucleic acids described herein, and a donor nucleic acid described herein. In some embodiments, the AAV vector provided herein is a self-inactivating AAV vector.

In some embodiments, an AAV vector provided herein comprises a modification, such as an insertion, deletion, chemical alteration, or synthetic modification, relative to a wild-type AAV vector.

In some embodiments, the viral particle that delivers the viral vector described herein is an AAV. AAVs are characterized by their serotype. Non-limiting examples of AAV serotypes are AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, scAAV, AAV-rh10, chimeric or hybrid AAV, or any combination, derivative, or variant thereof.

Producing AAV Particles

The AAV particles described herein can be referred to as recombinant AAV (rAAV). Often, rAAV particles are generated by transfecting AAV producing cells with an AAV-containing plasmid carrying the sequence encoding the genome editing tools, a plasmid that carries viral encoding regions, i.e., Rep and Cap gene regions, and a plasmid that provides the helper genes such as E1A, E1B, E2A, E4ORF6 and VA. In some embodiments, the AAV producing cells are mammalian cells. In some embodiments, host cells for rAAV viral particle production are mammalian cells. In some embodiments, a mammalian cell for rAAV viral particle production is a COS cell, a HEK293T cell, a HeLa cell, a KB cell, a derivative thereof, or a combination thereof. In some embodiments, rAAV virus particles can be produced in the mammalian cell culture system by providing the rAAV plasmid to the mammalian cell. In some embodiments, producing rAAV virus particles in a mammalian cell can comprise transfecting vectors that express the rep protein, the capsid protein, and the gene-of-interest expression construct flanked by the ITR sequence on the 5′ and 3′ ends. Methods of such processes are provided in, for example, Naso et al., BioDrugs, 2017 August; 31(4):317-334 and Benskey et al., (2019), Methods Mol Biol., 1937:3-26, each of which is incorporated by reference in its entirety.

In some embodiments, rAAV is produced in a non-mammalian cell. In some embodiments, rAAV is produced in an insect cell. In some embodiments, an insect cell for producing rAAV viral particles comprises a Sf9 cell. In some embodiments, production of rAAV virus particles in insect cells can comprise baculovirus. In some embodiments, production of rAAV virus particles in insect cells can comprise infecting the insect cells with three recombinant baculoviruses, one carrying the cap gene, one carrying the rep gene, and one carrying the gene-of-interest expression construct enclosed by an ITR on both the 5′ and 3′ end. In some embodiments, rAAV virus particles are produced by the One Bac system. In some embodiments, rAAV virus particles can be produced by the Two Bac system. In some embodiments, in the Two Bac system, the rep gene and the cap gene of the AAV are integrated into one baculovirus genome, and the ITR sequence and the gene-of-interest expression construct are integrated into another baculovirus genome. In some embodiments, in the One Bac system, an insect cell line that expresses both the rep protein and the capsid protein is established and infected with a baculovirus integrated with the ITR sequence and the gene-of-interest expression construct. Details of such processes are provided in, for example, Smith et al., (1983), Mol. Cell. Biol., 3(12):2156-65; Urabe et al., (2002), Hum. Gene. Ther., 1; 13(16):1935-43; and Benskey et al., (2019), Methods Mol Biol., 1937:3-26, each of which is incorporated by reference in its entirety.

VIII. Modifications

Polypeptides (e.g., effector proteins) and nucleic acids (e.g., engineered guide nucleic acids) described herein can be further modified as described throughout and as further described herein.

Examples of modifications of interest include those that do not alter primary sequence, including chemical derivatization of polypeptides, e.g., acylation, acetylation, carboxylation, amidation, etc. Also included are modifications of glycosylation, e.g., those made by modifying the glycosylation patterns of a polypeptide during its synthesis and processing or in further processing steps, e.g., by exposing the polypeptide to enzymes which affect glycosylation, such as mammalian glycosylating or deglycosylating enzymes. Also embraced are sequences that have phosphorylated amino acid residues, e.g., phosphotyrosine, phosphoserine, or phosphothreonine.

Modifications disclosed herein can also include modification of described polypeptides and/or engineered guide nucleic acids through any suitable method, such as molecular biological techniques and/or synthetic chemistry, to improve their resistance to proteolytic degradation, to change the target sequence specificity, to optimize solubility properties, to alter protein activity (e.g., transcription modulatory activity, enzymatic activity, etc.) or to render them more suitable. Analogs of such polypeptides include those containing residues other than naturally occurring L-amino acids, e.g., D-amino acids or non-naturally occurring synthetic amino acids. D-amino acids may be substituted for some or all of the amino acid residues. Modifications can also include modifications with non-naturally occurring amino acids. The particular sequence and the manner of preparation will be determined by convenience, economics, purity required, and the like.

Modifications can further include the introduction of various groups to polypeptides and/or engineered guide nucleic acids described herein. For example, groups can be introduced during synthesis or during expression of a polypeptide (e.g., an effector protein), which allow for linking to other molecules or to a surface. Thus, e.g., cysteines can be used to make thioethers, histidines for linking to a metal ion complex, carboxyl groups for forming amides or esters, amino groups for forming amides, and the like.

Modifications can further include modification of nucleic acids described herein (e.g., engineered guide nucleic acids) to provide the nucleic acid with a new or enhanced feature, such as improved stability. Such modifications of a nucleic acid include a base modification, a backbone modification, a sugar modification, or combinations thereof, of one or more nucleotides, nucleosides, or nucleobases in a nucleic acid.

In some embodiments, nucleic acids (e.g., engineered guide nucleic acids) described herein comprise one or more modifications comprising: 2′O-methyl modified nucleotides; 2′ Fluoro modified nucleotides; locked nucleic acid (LNA) modified nucleotides; peptide nucleic acid (PNA) modified nucleotides; nucleotides with phosphorothioate linkages; a 5′ cap (e.g., a 7-methylguanylate cap (m7G)), phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkylphosphotriesters, methyl and other alkyl phosphonates including 3′-alkylene phosphonates, 5′-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkyl phosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs of these, and those having inverted polarity wherein one or more internucleotide linkages is a 3′ to 3′, 5′ to 5′ or 2′ to 2′ linkage; phosphorothioate and/or heteroatom internucleoside linkages, such as —CH2-NH—O—CH2-, —CH2-N(CH3)-O—CH2- (known as a methylene (methylimino) or MMI backbone), —CH2-O—N(CH3)-CH2-, —CH2-N(CH3)-N(CH3)-CH2- and —O—N(CH3)-CH2-CH2- (wherein the native phosphodiester internucleotide linkage is represented as —O—P(═O)(OH)—O—CH2-); morpholino linkages (formed in part from the sugar portion of a nucleoside); morpholino backbones; phosphorodiamidate or other non-phosphodiester internucleoside linkages; siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; riboacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; other backbone modifications having mixed N, O, S and CH2 component parts; and combinations thereof.

IX. Systems

Disclosed herein, in some aspects, are systems for modifying a nucleic acid, comprising any one of the D2S effector proteins described herein, or a multimeric complex thereof. Systems may have components that can be used to detect, modify, or edit a target nucleic acid, wherein such components include, separately or in combination as a composition, a D2S effector protein, a guide nucleic acid, or other reagent or molecule described herein. Systems may be used to modify the activity or expression of a target nucleic acid. In some instances, systems comprise a D2S effector protein described herein, a reagent, support medium, or a combination thereof. In some instances, the D2S effector protein comprises a D2S effector protein, or a fusion protein thereof, described herein. In some instances, the D2S effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the D2S effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. Such systems may be used for detecting the presence of a target nucleic acid associated with or causative of a disease, such as cancer, a genetic disorder, or an infection. In some instances, such methods and systems are useful for phenotyping, genotyping, or determining ancestry. Unless specified otherwise, systems include kits and may be referred to as kits. Unless specified otherwise, systems include devices and may also be referred to as devices. Systems described herein may be provided in the form of a companion diagnostic assay or device, a point-of-care assay or device, or an over-the-counter diagnostic assay/device.

Systems described herein, in some aspects, are for detecting or modifying a target sequence of a target nucleic acid comprising: a) a polypeptide (e.g., an effector protein) described herein, or a nucleic acid encoding the polypeptide; and b) an engineered guide nucleic acid. In some cases, the polypeptide comprises an amino acid sequence that is at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 23. In some cases, the engineered guide nucleic acid comprises a sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to a sequence selected from SEQ ID NOS: 624, 628, 630, 634, 638, 641, 643, and 645. In some cases, the target nucleic acid comprises a protospacer adjacent motif (PAM) selected from SEQ ID NOS: 156-159, 325-328, or 343. In some embodiments, the PAM is required for the polypeptide and engineered guide nucleic acid to detect or modify the target sequence. In some cases, the polypeptide comprises a mutation that reduces a catalytic activity of the polypeptide relative to the polypeptide that is 100% identical to SEQ ID NO: 23. In some cases, the polypeptide is capable of binding to the target nucleic acid but has reduced or no nuclease activity on the target nucleic acid. In some cases, the polypeptide is a nuclease that is capable of cleaving at least one strand of a target nucleic acid. In some cases, the system comprises a fusion partner protein fused to the polypeptide. In some cases, the system comprises at least one of a detection reagent and an amplification reagent. In some cases, the detection reagent is selected from: a reporter nucleic acid, a detection moiety, an additional polypeptide, and a combination thereof. In some cases, the at least one amplification reagent is selected from the group consisting of a primer, a polymerase, a dNTP, an rNTP, and combinations thereof. In some cases, the target nucleic acid comprises a protospacer adjacent motif (PAM) selected from any one of SEQ ID NOS: 156-159, 325-328, and 369, and the PAM is required for the polypeptide and engineered guide nucleic acid to detect or modify the target sequence. In some cases, the target nucleic acid comprises a PAM sequence of SEQ ID NO: 369. Also described herein are compositions comprising a polypeptide and an engineered guide nucleic acid. In some embodiments, the polypeptide comprises an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to SEQ ID NO: 23. In some embodiments, the engineered guide nucleic acid comprises a sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to a sequence selected from: SEQ ID NOS: 624, 628, 630, 634, 638, 641, 643, and 645. In some embodiments, the polypeptide is fused to at least one nuclear localization signal. In some cases, the polypeptide is capable of binding to the target nucleic acid but has reduced or no nuclease activity on the target nucleic acid. In some cases, the polypeptide is a nuclease that is capable of cleaving at least one strand of a target nucleic acid. In some cases, the system comprises a fusion partner protein fused to the polypeptide. In some cases, the length of the polypeptide is about 450 to about 550, about 400 to about 600, or about 450 to about 500 linked amino acids. In some cases, the composition comprises a recombinase. In some cases, the composition further comprises a target nucleic acid, wherein the target nucleic acid comprises a PAM sequence selected from any one of SEQ ID NOs: 156-159, 325-328, and 369. In some cases, the composition comprises a donor nucleic acid.

Reagents and effector proteins of various systems may be provided in a reagent chamber or on the support medium. Alternatively, the reagent and/or effector protein may be contacted with the reagent chamber or the support medium by the individual using the system. An exemplary reagent chamber is a test well or container. The opening of the reagent chamber may be large enough to accommodate the support medium. Optionally, the system comprises a buffer and a dropper. The buffer may be provided in a dropper bottle for ease of dispensing. The dropper may be disposable and transfer a fixed volume. The dropper may be used to place a sample into the reagent chamber or on the support medium.

System Solutions

In general, systems comprise a solution in which the activity of an effector protein occurs. Often, the solution comprises or consists essentially of a buffer. The solution or buffer may comprise a buffering agent, a salt, a crowding agent, a detergent, a reducing agent, a competitor, or a combination thereof. Often the buffer is the primary component or the basis for the solution in which the activity occurs. Thus, concentrations for components of buffers described herein (e.g., buffering agents, salts, crowding agents, detergents, reducing agents, and competitors) are the same or essentially the same as the concentration of these components in the solution in which the activity occurs. In some instances, a buffer is required for cell lysis activity or viral lysis activity.

In some instances, systems comprise a buffer, wherein the buffer comprises at least one buffering agent. Exemplary buffering agents include HEPES, TRIS, MES, ADA, PIPES, ACES, MOPSO, BIS-TRIS propane, BES, MOPS, TES, DISO, Trizma, TRICINE, GLY-GLY, HEPPS, BICINE, TAPS, AMPD, AMPSO, CHES, CAPSO, AMP, CAPS, phosphate, citrate, acetate, imidazole, or any combination thereof. In some instances, the concentration of the buffering agent in the buffer is 1 mM to 200 mM. A buffer compatible with an effector protein may comprise a buffering agent at a concentration of 10 mM to 30 mM. A buffer compatible with an effector protein may comprise a buffering agent at a concentration of about 20 mM. A buffering agent may provide a pH for the buffer or the solution in which the activity of the effector protein occurs. The pH may be 3 to 4, 3.5 to 4.5, 4 to 5, 4.5 to 5.5, 5 to 6, 5.5 to 6.5, 6 to 7, 6.5 to 7.5, 7 to 8, 7.5 to 8.5, 8 to 9, 8.5 to 9.5, 9 to 10, or 9.5 to 10.5.

In some instances, systems comprise a solution, wherein the solution comprises at least one salt. In some instances, the at least one salt is selected from potassium acetate, magnesium acetate, sodium chloride, potassium chloride, magnesium chloride, calcium chloride, and any combination thereof. In some instances, the concentration of the at least one salt in the solution is 5 mM to 100 mM, 5 mM to 10 mM, 1 mM to 60 mM, or 1 mM to 10 mM. In some instances, the concentration of the at least one salt is about 105 mM. In some instances, the concentration of the at least one salt is about 55 mM. In some instances, the concentration of the at least one salt is about 7 mM. In some instances, the solution comprises potassium acetate and magnesium acetate. In some instances, the solution comprises sodium chloride and magnesium chloride. In some instances, the solution comprises potassium chloride and magnesium chloride. In some instances, the salt is a magnesium salt and the concentration of magnesium in the solution is at least 5 mM, at least 7 mM, at least 9 mM, at least 11 mM, at least 13 mM, or at least 15 mM. In some instances, the concentration of magnesium is less than 20 mM, less than 18 mM or less than 16 mM.

In some instances, systems comprise a solution, wherein the solution comprises at least one crowding agent. A crowding agent may reduce the volume of solvent available for other molecules in the solution, thereby increasing the effective concentrations of said molecules. Exemplary crowding agents include glycerol and bovine serum albumin. In some instances, the crowding agent is glycerol. In some instances, the concentration of the crowding agent in the solution is 0.01% (v/v) to 10% (v/v). In some instances, the concentration of the crowding agent in the solution is 0.5% (v/v) to 10% (v/v).

In some instances, systems comprise a solution, wherein the solution comprises at least one detergent. Exemplary detergents include Tween, Triton-X, and IGEPAL. A solution may comprise Tween, Triton-X, or any combination thereof. A solution may comprise Triton-X. A solution may comprise IGEPAL CA-630. In some instances, the concentration of the detergent in the solution is 2% (v/v) or less. In some instances, the concentration of the detergent in the solution is 1% (v/v) or less. In some instances, the concentration of the detergent in the solution is 0.00001% (v/v) to 0.01% (v/v). In some instances, the concentration of the detergent in the solution is about 0.01% (v/v).

In some instances, systems comprise a solution, wherein the solution comprises at least one reducing agent. Exemplary reducing agents comprise dithiothreitol (DTT), β-mercaptoethanol (BME), or tris(2-carboxyethyl) phosphine (TCEP). In some instances, the reducing agent is DTT. In some instances, the concentration of the reducing agent in the solution is 0.01 mM to 100 mM. In some instances, the concentration of the reducing agent in the solution is 0.1 mM to 10 mM. In some instances, the concentration of the reducing agent in the solution is 0.5 mM to 2 mM. In some instances, the concentration of the reducing agent in the solution is about 1 mM.

In some instances, systems comprise a solution, wherein the solution comprises a competitor. In general, competitors compete with the target nucleic acid or the reporter nucleic acid for cleavage by the effector protein or a dimer thereof. Exemplary competitors include heparin, imidazole, and salmon sperm DNA. In some instances, the concentration of the competitor in the solution is 1 μg/mL to 100 μg/mL. In some instances, the concentration of the competitor in the solution is 40 μg/mL to 60 μg/mL.

In some instances, systems comprise a solution, wherein the solution comprises a co-factor. In some instances, the co-factor allows an effector protein or a multimeric complex thereof to perform a function, including pre-crRNA processing and/or target nucleic acid cleavage. The suitability of a co-factor for an effector protein or a multimeric complex thereof may be assessed, such as by methods based on those described by Sundaresan et al. (Cell Rep. 2017 Dec. 26; 21(13):3728-3739). In some instances, an effector protein or a multimeric complex thereof forms a complex with a co-factor. In some instances, the co-factor is a divalent metal ion. In some instances, the divalent metal ion is selected from Mg²⁺, Mn²⁺, Zn²⁺, Ca²⁺, and Cu²⁺. In some instances, the divalent metal ion is Mg²⁺. In some instances, the effector protein is a D2S effector protein and the co-factor is Mg²⁺.
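By way of non-limiting illustration, a proposed solution composition may be compared against the broadest ranges recited in this section as sketched below. The class name, field names, and function name are hypothetical. The sketch treats each recited range as an inclusive bound, omits salt concentration because several distinct ranges and point values are recited for it, and does not represent an assay protocol.

```python
# Illustrative sketch only. All identifiers are hypothetical; the bounds are
# the broadest ranges recited in the text, treated as inclusive.
from dataclasses import dataclass
from typing import List, Optional

BROAD_RANGES = {
    "buffering_agent_mM": (1.0, 200.0),
    "pH": (3.0, 10.5),
    "crowding_agent_pct_vv": (0.01, 10.0),
    "detergent_pct_vv": (0.0, 2.0),  # "2% (v/v) or less"
    "reducing_agent_mM": (0.01, 100.0),
    "competitor_ug_per_mL": (1.0, 100.0),
}


@dataclass
class SolutionComposition:
    """Optional components of a solution; None means the component is absent."""
    buffering_agent_mM: Optional[float] = None
    pH: Optional[float] = None
    crowding_agent_pct_vv: Optional[float] = None
    detergent_pct_vv: Optional[float] = None
    reducing_agent_mM: Optional[float] = None
    competitor_ug_per_mL: Optional[float] = None


def out_of_range_components(solution: SolutionComposition) -> List[str]:
    """Return names of present components that fall outside BROAD_RANGES."""
    problems = []
    for name, (low, high) in BROAD_RANGES.items():
        value = getattr(solution, name)
        if value is not None and not (low <= value <= high):
            problems.append(name)
    return problems
```

For example, a composition with 20 mM buffering agent, pH 7.5, 1 mM reducing agent, and 0.01% (v/v) detergent yields an empty list in this sketch, whereas a composition at pH 11 with 3% (v/v) detergent is flagged for both components.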

Reporters

In some embodiments, systems disclosed herein comprise a detection reagent and an amplification reagent. In some instances, a detection reagent comprises a reporter. In some embodiments, a reporter or a reporter nucleic acid comprises a non-target nucleic acid molecule that can provide a detectable signal upon cleavage by an effector protein. In some instances, a detection reagent comprises an additional polypeptide. In some instances, a detection reagent comprises a detection moiety. In some instances, systems disclosed herein comprise a reporter. By way of non-limiting and illustrative example, a reporter may comprise a single stranded nucleic acid and a detection moiety (e.g., a labeled single stranded RNA reporter), wherein the nucleic acid is capable of being cleaved by an effector protein (e.g., a D2S CRISPR/Cas protein as disclosed herein) or a multimeric complex thereof, releasing the detection moiety, and generating a detectable signal. As used herein, “reporter” is used interchangeably with “reporter nucleic acid” or “reporter molecule”. The effector proteins disclosed herein, activated upon hybridization of a guide RNA to a target nucleic acid, may cleave the reporter. Cleaving the “reporter” may be referred to herein as cleaving the “reporter nucleic acid,” the “reporter molecule,” or the “nucleic acid of the reporter.” Reporters may comprise RNA. Reporters may comprise DNA. Reporters may be double-stranded. Reporters may be single-stranded.

In some instances, reporters comprise a protein capable of generating a signal. A signal may be a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal. In some instances, the reporter comprises a detection moiety. Suitable detectable labels and/or moieties that may provide a signal include, but are not limited to, an enzyme, a radioisotope, a member of a specific binding pair, a fluorophore, a fluorescent protein, a quantum dot, and the like.

In some instances, the reporter comprises a detection moiety and a quenching moiety. In some instances, the reporter comprises a cleavage site, wherein the detection moiety is located at a first site on the reporter and the quenching moiety is located at a second site on the reporter, wherein the first site and the second site are separated by the cleavage site. Sometimes the quenching moiety is a fluorescence quenching moiety. In some instances, the quenching moiety is 5′ to the cleavage site and the detection moiety is 3′ to the cleavage site. In some instances, the detection moiety is 5′ to the cleavage site and the quenching moiety is 3′ to the cleavage site. Sometimes the quenching moiety is at the 5′ terminus of the nucleic acid of a reporter. Sometimes the detection moiety is at the 3′ terminus of the nucleic acid of a reporter. In some instances, the detection moiety is at the 5′ terminus of the nucleic acid of a reporter. In some instances, the quenching moiety is at the 3′ terminus of the nucleic acid of a reporter.

Suitable fluorescent proteins include, but are not limited to, green fluorescent protein (GFP) or variants thereof, blue fluorescent variant of GFP (BFP), cyan fluorescent variant of GFP (CFP), yellow fluorescent variant of GFP (YFP), enhanced GFP (EGFP), enhanced CFP (ECFP), enhanced YFP (EYFP), GFPS65T, Emerald, Topaz (TYFP), Venus, Citrine, mCitrine, GFPuv, destabilised EGFP (dEGFP), destabilised ECFP (dECFP), destabilised EYFP (dEYFP), mCFPm, Cerulean, T-Sapphire, CyPet, YPet, mKO, HcRed, t-HcRed, DsRed, DsRed2, DsRed-monomer, J-Red, dimer2, t-dimer2(12), mRFP1, pocilloporin, Renilla GFP, Monster GFP, paGFP, Kaede protein and kindling protein, Phycobiliproteins and Phycobiliprotein conjugates including B-Phycoerythrin, R-Phycoerythrin and Allophycocyanin. Suitable enzymes include, but are not limited to, horse radish peroxidase (HRP), alkaline phosphatase (AP), beta-galactosidase (GAL), glucose-6-phosphate dehydrogenase, beta-N-acetylglucosaminidase, β-glucuronidase, invertase, Xanthine Oxidase, firefly luciferase, and glucose oxidase (GO).

In some instances, the detection moiety comprises a polypeptide. In some instances, the detection moiety comprises an invertase. The substrate of the invertase may be sucrose. A DNS reagent may be included in the system to produce a colorimetric change when the invertase converts sucrose to glucose. In some instances, the reporter nucleic acid and invertase are conjugated using a heterobifunctional linker via sulfo-SMCC chemistry.

Suitable fluorophores may provide a detectable fluorescence signal in the same range as 6-Fluorescein (Integrated DNA Technologies), IRDye 700 (Integrated DNA Technologies), TYE 665 (Integrated DNA Technologies), Alexa Fluor 594 (Integrated DNA Technologies), or ATTO™ 633 (NHS Ester) (Integrated DNA Technologies). Non-limiting examples of fluorophores are fluorescein amidite, 6-Fluorescein, IRDye 700, TYE 665, Alexa Fluor 594, or ATTO™ 633 (NHS Ester). The fluorophore may be an infrared fluorophore. The fluorophore may emit fluorescence in the range of 500 nm to 720 nm. In some instances, the fluorophore emits fluorescence at a wavelength of 700 nm or higher. In other cases, the fluorophore emits fluorescence at about 665 nm. In some instances, the fluorophore emits fluorescence in the range of 500 nm to 520 nm, 500 nm to 540 nm, 500 nm to 590 nm, 590 nm to 600 nm, 600 nm to 610 nm, 610 nm to 620 nm, 620 nm to 630 nm, 630 nm to 640 nm, 640 nm to 650 nm, 650 nm to 660 nm, 660 nm to 670 nm, 670 nm to 680 nm, 680 nm to 690 nm, 690 nm to 700 nm, 700 nm to 710 nm, 710 nm to 720 nm, or 720 nm to 730 nm. In some instances, the fluorophore emits fluorescence in the range of 450 nm to 750 nm, 500 nm to 650 nm, or 550 nm to 650 nm.

Systems may comprise a quenching moiety. A quenching moiety may be chosen based on its ability to quench the detection moiety. A quenching moiety may be a non-fluorescent fluorescence quencher. A quenching moiety may quench a detection moiety that emits fluorescence in the range of 500 nm to 720 nm. In some instances, the quenching moiety quenches a detection moiety that emits fluorescence at a wavelength of 700 nm or higher. In other cases, the quenching moiety quenches a detection moiety that emits fluorescence at about 660 nm or about 670 nm. In some instances, the quenching moiety quenches a detection moiety that emits fluorescence in the range of 500 to 520, 500 to 540, 500 to 590, 590 to 600, 600 to 610, 610 to 620, 620 to 630, 630 to 640, 640 to 650, 650 to 660, 660 to 670, 670 to 680, 680 to 690, 690 to 700, 700 to 710, 710 to 720, or 720 to 730 nm. In some instances, the quenching moiety quenches a detection moiety that emits fluorescence in the range of 450 nm to 750 nm, 500 nm to 650 nm, or 550 nm to 650 nm. A quenching moiety may quench fluorescein amidite, 6-Fluorescein, IRDye 700, TYE 665, Alexa Fluor 594, or ATTO™ 633 (NHS Ester). A quenching moiety may be Iowa Black RQ, Iowa Black FQ or IRDye QC-1 Quencher. A quenching moiety may quench fluorescein amidite, 6-Fluorescein (Integrated DNA Technologies), IRDye 700 (Integrated DNA Technologies), TYE 665 (Integrated DNA Technologies), Alexa Fluor 594 (Integrated DNA Technologies), or ATTO™ 633 (NHS Ester) (Integrated DNA Technologies). A quenching moiety may be Iowa Black RQ (Integrated DNA Technologies), Iowa Black FQ (Integrated DNA Technologies) or IRDye QC-1 Quencher (LiCor). Any of the quenching moieties described herein may be from any commercially available source, may be an alternative with a similar function, a generic, or a non-tradename of the quenching moieties listed.

The generation of the detectable signal from the release of the detection moiety indicates that cleavage by the effector protein has occurred and that the sample contains the target nucleic acid. In some instances, the detection moiety comprises a fluorescent dye. Sometimes the detection moiety comprises a fluorescence resonance energy transfer (FRET) pair. In some instances, the detection moiety comprises an infrared (IR) dye. In some instances, the detection moiety comprises an ultraviolet (UV) dye. Alternatively, or in combination, the detection moiety comprises a protein. Sometimes the detection moiety comprises a biotin. Sometimes the detection moiety comprises at least one of avidin or streptavidin. In some instances, the detection moiety comprises a polysaccharide, a polymer, or a nanoparticle. In some instances, the detection moiety comprises a gold nanoparticle or a latex nanoparticle.

A detection moiety may be any moiety capable of generating a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal. A nucleic acid of a reporter, sometimes, is a protein-nucleic acid that is capable of generating a calorimetric, potentiometric, amperometric, optical (e.g., fluorescent, colorimetric, etc.), or piezo-electric signal upon cleavage of the nucleic acid. Often a calorimetric signal is heat produced after cleavage of the nucleic acids of a reporter. Sometimes, a calorimetric signal is heat absorbed after cleavage of the nucleic acids of a reporter. A potentiometric signal, for example, is electrical potential produced after cleavage of the nucleic acids of a reporter. An amperometric signal may be movement of electrons produced after the cleavage of the nucleic acid of a reporter. Often, the signal is an optical signal, such as a colorimetric signal or a fluorescence signal. An optical signal is, for example, a light output produced after the cleavage of the nucleic acids of a reporter. Sometimes, an optical signal is a change in light absorbance between before and after the cleavage of nucleic acids of a reporter. Often, a piezo-electric signal is a change in mass between before and after the cleavage of the nucleic acid of a reporter.

The detectable signal may be a colorimetric signal or a signal visible by eye. In some instances, the detectable signal may be fluorescent, electrical, chemical, electrochemical, or magnetic. In some instances, the first detection signal may be generated by binding of the detection moiety to the capture molecule in the detection region, where the first detection signal indicates that the sample contained the target nucleic acid. Sometimes systems are capable of detecting more than one type of target nucleic acid, wherein the system comprises more than one type of guide nucleic acid and more than one type of reporter nucleic acid. In some instances, the detectable signal may be generated directly by the cleavage event. Alternatively, or in combination, the detectable signal may be generated indirectly by the signal event. Sometimes the detectable signal is not a fluorescent signal. In some instances, the detectable signal may be a colorimetric or color-based signal. In some instances, the detected target nucleic acid may be identified based on its spatial location on the detection region of the support medium. In some instances, the second detectable signal may be generated in a spatially distinct location from the first generated signal.

In some instances, the reporter nucleic acid is a single-stranded nucleic acid sequence comprising ribonucleotides. The nucleic acid of a reporter may be a single-stranded nucleic acid sequence comprising at least one ribonucleotide. In some instances, the nucleic acid of a reporter is a single-stranded nucleic acid comprising at least one ribonucleotide residue at an internal position that functions as a cleavage site. In some instances, the nucleic acid of a reporter comprises at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10 ribonucleotide residues at an internal position. In some instances, the nucleic acid of a reporter comprises from 2 to 10, from 3 to 9, from 4 to 8, or from 5 to 7 ribonucleotide residues at an internal position. Sometimes the ribonucleotide residues are continuous. Alternatively, the ribonucleotide residues are interspersed in between non-ribonucleotide residues. In some instances, the nucleic acid of a reporter has only ribonucleotide residues. In some instances, the nucleic acid of a reporter has only deoxyribonucleotide residues. In some instances, the nucleic acid comprises nucleotides resistant to cleavage by the effector protein described herein. In some instances, the nucleic acid of a reporter comprises synthetic nucleotides. In some instances, the nucleic acid of a reporter comprises at least one ribonucleotide residue and at least one non-ribonucleotide residue.

In some instances, the nucleic acid of a reporter comprises at least one uracil ribonucleotide. In some instances, the nucleic acid of a reporter comprises at least two uracil ribonucleotides. Sometimes the nucleic acid of a reporter has only uracil ribonucleotides. In some instances, the nucleic acid of a reporter comprises at least one adenine ribonucleotide. In some instances, the nucleic acid of a reporter comprises at least two adenine ribonucleotides. In some instances, the nucleic acid of a reporter has only adenine ribonucleotides. In some instances, the nucleic acid of a reporter comprises at least one cytosine ribonucleotide. In some instances, the nucleic acid of a reporter comprises at least two cytosine ribonucleotides. In some instances, the nucleic acid of a reporter comprises at least one guanine ribonucleotide. In some instances, the nucleic acid of a reporter comprises at least two guanine ribonucleotides. In some instances, a nucleic acid of a reporter comprises a single unmodified ribonucleotide. In some instances, a nucleic acid of a reporter comprises only unmodified deoxyribonucleotides.

In some instances, the nucleic acid of a reporter is 5 to 20, 5 to 15, 5 to 10, 7 to 20, 7 to 15, or 7 to 10 nucleotides in length. In some instances, the nucleic acid of a reporter is 3 to 20, 4 to 10, 5 to 10, or 5 to 8 nucleotides in length. In some instances, the nucleic acid of a reporter is 5 to 12 nucleotides in length. In some instances, the reporter nucleic acid is at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 nucleotides in length. In some instances, the reporter nucleic acid is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides in length.

In some instances, systems comprise a plurality of reporters. The plurality of reporters may comprise a plurality of signals. In some instances, systems comprise at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 30, at least 40, or at least 50 reporters. In some instances, there are 2 to 50, 3 to 40, 4 to 30, 5 to 20, or 6 to 10 different reporters.

In some instances, systems comprise a D2S effector protein and a reporter nucleic acid configured to undergo transcollateral cleavage by the D2S effector protein. Transcollateral cleavage of the reporter may generate a signal from the reporter or alter a signal from the reporter. In some instances, the signal is an optical signal, such as a fluorescence signal or absorbance band. Transcollateral cleavage of the reporter may alter the wavelength, intensity, or polarization of the optical signal. For example, the reporter may comprise a fluorophore and a quencher, such that transcollateral cleavage of the reporter separates the fluorophore and the quencher, thereby increasing a fluorescence signal from the fluorophore. Herein, detection of reporter cleavage to determine the presence of a target nucleic acid sequence may be referred to as ‘DETECTR’. In some instances, described herein is a method of assaying for a target nucleic acid in a sample comprising contacting the target nucleic acid with an effector protein, a non-naturally occurring guide nucleic acid that hybridizes to a segment of the target nucleic acid, and a reporter nucleic acid, and assaying for a change in a signal, wherein the change in the signal is produced by cleavage of the reporter nucleic acid.

In the presence of a large amount of non-target nucleic acids, an activity of an effector protein (e.g., a D2S effector protein as disclosed herein) may be inhibited. This is because the activated effector proteins collaterally cleave any nucleic acids. If total nucleic acids are present in large amounts, they may outcompete reporters for the effector proteins. In some instances, systems comprise an excess of reporter(s), such that when the system is operated and a solution of the system comprising the reporter is combined with a sample comprising a target nucleic acid, the concentration of the reporter in the combined solution-sample is greater than the concentration of the target nucleic acid. In some instances, the sample comprises amplified target nucleic acid. In some instances, the sample comprises an unamplified target nucleic acid. In some instances, the concentration of the reporter is greater than the concentration of target nucleic acids and non-target nucleic acids. The non-target nucleic acids may be from the original sample, either lysed or unlysed. The non-target nucleic acids may comprise byproducts of amplification. In some instances, systems comprise a reporter wherein the concentration of the reporter in a solution is at least 1.5 fold, at least 2 fold, at least 3 fold, at least 4 fold, at least 5 fold, at least 6 fold, at least 7 fold, at least 8 fold, at least 9 fold, at least 10 fold, at least 11 fold, at least 12 fold, at least 13 fold, at least 14 fold, at least 15 fold, at least 16 fold, at least 17 fold, at least 18 fold, at least 19 fold, at least 20 fold, at least 30 fold, at least 40 fold, at least 50 fold, at least 60 fold, at least 70 fold, at least 80 fold, at least 90 fold, at least 100 fold, 1.5 fold to 100 fold, 2 fold to 10 fold, 10 fold to 20 fold, 20 fold to 30 fold, 30 fold to 40 fold, 40 fold to 50 fold, 50 fold to 60 fold, 60 fold to 70 fold, 70 fold to 80 fold, 80 fold to 90 fold, 90 fold to 100 fold, 1.5 fold to 10 fold, 1.5 fold to 20 fold, 10 fold to 40 fold, 20 fold to 60 fold, or 10 fold to 80 fold excess of total nucleic acids.
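By way of non-limiting illustration, the fold excess of reporter over total nucleic acids may be computed as a simple ratio of concentrations. The following Python sketch is illustrative only; the function names and the assumption that both concentrations are expressed in the same molar units (here nM) are hypothetical and are not part of the disclosure.

```python
def reporter_fold_excess(reporter_nM: float, total_nucleic_acid_nM: float) -> float:
    """Return the ratio of reporter concentration to total nucleic acid concentration.

    Both inputs are assumed (for illustration) to be in the same units, e.g., nM.
    """
    if reporter_nM < 0 or total_nucleic_acid_nM <= 0:
        raise ValueError("concentrations must be positive")
    return reporter_nM / total_nucleic_acid_nM


def meets_fold_excess(reporter_nM: float, total_nucleic_acid_nM: float,
                      minimum_fold: float = 1.5) -> bool:
    """Check whether the reporter is present in at least `minimum_fold` excess."""
    return reporter_fold_excess(reporter_nM, total_nucleic_acid_nM) >= minimum_fold
```

For instance, under these assumptions a reporter at 100 nM combined with 10 nM total nucleic acids corresponds to a 10 fold excess.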

Amplification Reagents/Components

In some instances, systems described herein comprise a reagent or component for amplifying a nucleic acid. In some embodiments, amplification and amplifying, or grammatical equivalents thereof, comprise a process by which a nucleic acid molecule is enzymatically copied to generate a plurality of nucleic acid molecules containing the same sequence as the original nucleic acid molecule or a distinguishable portion thereof. Non-limiting examples of reagents for amplifying a nucleic acid include polymerases, primers, and nucleotides (e.g., dNTPs or rNTPs). In some instances, systems comprise reagents for nucleic acid amplification of a target nucleic acid in a sample. Nucleic acid amplification of the target nucleic acid may improve at least one of sensitivity, specificity, or accuracy of the assay in detecting the target nucleic acid. In some instances, nucleic acid amplification is isothermal nucleic acid amplification, providing for the use of the system in remote regions or low resource settings without specialized equipment for amplification. In some instances, amplification of the target nucleic acid increases the concentration of the target nucleic acid in the sample relative to the concentration of nucleic acids that do not correspond to the target nucleic acid.

The reagents for nucleic acid amplification may comprise a recombinase, an oligonucleotide primer, a single-stranded DNA binding (SSB) protein, a polymerase, or a combination thereof that is suitable for an amplification reaction. Non-limiting examples of amplification reactions are transcription mediated amplification (TMA), helicase dependent amplification (HDA), circular helicase dependent amplification (cHDA), strand displacement amplification (SDA), recombinase polymerase amplification (RPA), loop mediated isothermal amplification (LAMP), exponential amplification reaction (EXPAR), rolling circle amplification (RCA), ligase chain reaction (LCR), simple method amplifying RNA targets (SMART), single primer isothermal amplification (SPIA), multiple displacement amplification (MDA), nucleic acid sequence based amplification (NASBA), hinge-initiated primer-dependent amplification of nucleic acids (HIP), nicking enzyme amplification reaction (NEAR), and improved multiple displacement amplification (IMDA).

In some instances, systems comprise a PCR tube, a PCR well or a PCR plate. The wells of the PCR plate may be pre-aliquoted with the reagent for amplifying a nucleic acid, as well as a guide nucleic acid, an effector protein, a multimeric complex, or any combination thereof. The wells of the PCR plate may be pre-aliquoted with a guide nucleic acid targeting a target sequence, an effector protein capable of being activated when complexed with the guide nucleic acid and the target sequence, and at least one population of a single stranded reporter nucleic acid comprising a detection moiety. A user may thus add the biological sample of interest to a well of the pre-aliquoted PCR plate and measure for the detectable signal with a fluorescent light reader or a visible light reader.

In some instances, systems comprise a PCR plate; a guide nucleic acid targeting a target sequence; an effector protein capable of being activated when complexed with the guide nucleic acid and the target sequence; and a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid is capable of being cleaved by the activated nuclease, thereby generating a detectable signal.

In some instances, systems comprise a support medium; a guide nucleic acid targeting a target sequence; and an effector protein capable of being activated when complexed with the guide nucleic acid and the target sequence. In some instances, nucleic acid amplification is performed in a nucleic acid amplification region on the support medium. Alternatively, or in combination, the nucleic acid amplification is performed in a reagent chamber, and the resulting sample is applied to the support medium.

In some instances, a system for modifying a target nucleic acid comprises a PCR plate; a guide nucleic acid targeting a target sequence; and an effector protein capable of being activated when complexed with the guide nucleic acid and the target sequence. The wells of the PCR plate may be pre-aliquoted with the guide nucleic acid targeting a target sequence, and an effector protein capable of being activated when complexed with the guide nucleic acid and the target sequence. A user may thus add the biological sample of interest to a well of the pre-aliquoted PCR plate.

Often, the nucleic acid amplification is performed for no greater than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or 60 minutes, or any value 1 to 60 minutes. Sometimes, the nucleic acid amplification is performed for 1 to 60, 5 to 55, 10 to 50, 15 to 45, 20 to 40, or 25 to 35 minutes. Sometimes, the nucleic acid amplification reaction is performed at a temperature of around 20-45° C. In some instances, the nucleic acid amplification reaction is performed at a temperature no greater than 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., 45° C., or any value 20° C. to 45° C. In some instances, the nucleic acid amplification reaction is performed at a temperature of at least 20° C., 25° C., 30° C., 35° C., 37° C., 40° C., or 45° C., or any value 20° C. to 45° C. In some instances, the nucleic acid amplification reaction is performed at a temperature of 20° C. to 45° C., 25° C. to 40° C., 30° C. to 40° C., or 35° C. to 40° C.

Often, systems comprise primers for amplifying a target nucleic acid to produce an amplification product comprising the target nucleic acid and a PAM. For instance, at least one of the primers may comprise the PAM that is incorporated into the amplification product during amplification. The compositions for amplification of target nucleic acids and methods of use thereof, as described herein, are compatible with any of the methods disclosed herein including methods of assaying for at least one base difference (e.g., assaying for a SNP or a base mutation) in a target nucleic acid sequence, methods of assaying for a target nucleic acid that lacks a PAM by amplifying the target nucleic acid sequence to introduce a PAM, and compositions used in introducing a PAM via amplification into the target nucleic acid sequence.

Additional System Components

In some instances, systems include a package, carrier, or container that is compartmentalized to receive one or more containers such as vials, tubes, and the like, each of the container(s) comprising one of the separate elements to be used in a method described herein. Suitable containers include, for example, test wells, bottles, vials, and test tubes. In one embodiment, the containers are formed from a variety of materials such as glass, plastic, or polymers. The system or systems described herein contain packaging materials. Examples of packaging materials include, but are not limited to, pouches, blister packs, bottles, tubes, bags, containers, and any packaging material suitable for the intended mode of use.

A system may include labels listing contents and/or instructions for use, or package inserts with instructions for use. A set of instructions will also typically be included. In one embodiment, a label is on or associated with the container. In some instances, a label is on a container when letters, numbers or other characters forming the label are attached, molded or etched into the container itself; a label is associated with a container when it is present within a receptacle or carrier that also holds the container, e.g., as a package insert. In one embodiment, a label is used to indicate that the contents are to be used for a specific therapeutic application. The label also indicates directions for use of the contents, such as in the methods described herein. After packaging the formed product and wrapping or boxing to maintain a sterile barrier, the product may be terminally sterilized by heat sterilization, gas sterilization, gamma irradiation, or by electron beam sterilization. Alternatively, the product may be prepared and packaged by aseptic processing.

In some instances, systems comprise a solid support. An RNP or effector protein may be attached to a solid support. The solid support may be an electrode or a bead. The bead may be a magnetic bead. Upon cleavage, the RNP is liberated from the solid support and interacts with other mixtures. For example, upon cleavage of the nucleic acid of the RNP, the effector protein of the RNP flows through a chamber into a mixture comprising a substrate. When the effector protein meets the substrate, a reaction occurs, such as a colorimetric reaction, which is then detected. As another example, the protein is an enzyme substrate, and upon cleavage of the nucleic acid of the enzyme substrate-nucleic acid, the enzyme substrate flows through a chamber into a mixture comprising the enzyme. When the enzyme substrate meets the enzyme, a reaction occurs, such as a colorimetric reaction, which is then detected.

Certain System Conditions

In some instances, systems and methods are employed under certain conditions that enhance an activity of the effector protein relative to alternative conditions, as measured by a detectable signal released from cleavage of a reporter in the presence of the target nucleic acid. The detectable signal may be generated at about the rate of transcollateral cleavage of a reporter nucleic acid. In some instances, the reporter nucleic acid is a homopolymeric reporter nucleic acid comprising 5 to 20 consecutive adenines, 5 to 20 consecutive thymines, 5 to 20 consecutive cytosines, or 5 to 20 consecutive guanines. In some instances, the reporter is an RNA-FQ reporter.
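By way of non-limiting illustration, whether a candidate reporter sequence comprises a run of 5 to 20 consecutive identical bases may be checked programmatically. The following Python sketch is illustrative only; the function name, the single-letter base alphabet, and the default bounds taken from the range recited above are assumptions made for the example.

```python
from itertools import groupby


def has_homopolymer_run(sequence: str, min_len: int = 5, max_len: int = 20,
                        bases: str = "ACGT") -> bool:
    """Return True if the sequence contains a run of one allowed base
    whose length is between min_len and max_len, inclusive."""
    allowed = set(bases)
    # groupby yields each maximal run of identical consecutive characters.
    for base, group in groupby(sequence.upper()):
        run_length = sum(1 for _ in group)
        if base in allowed and min_len <= run_length <= max_len:
            return True
    return False
```

Under these assumptions, a sequence such as "TTTTT" qualifies, whereas a mixed sequence such as "ACGTACGT" or a run of 25 adenines does not.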

In some instances, effector proteins disclosed herein recognize, bind, or are activated by, different target nucleic acids having different sequences, but are active toward the same reporter nucleic acid, allowing for facile multiplexing in a single assay having a single ssRNA-FQ reporter.

In some instances, systems are employed under certain conditions that enhance transcollateral cleavage activity of an effector protein. In some instances, under certain conditions, transcollateral cleavage occurs at a rate of at least 0.005 mmol/min, at least 0.01 mmol/min, at least 0.05 mmol/min, at least 0.1 mmol/min, at least 0.2 mmol/min, at least 0.5 mmol/min, or at least 1 mmol/min. In some instances, systems and methods are employed under certain conditions that enhance cis-cleavage activity of the effector protein.

Certain conditions that may enhance the activity of an effector protein include a certain salt presence or salt concentration of the solution in which the activity occurs. For example, cis-cleavage activity of an effector protein may be inhibited or halted by a high salt concentration. The salt may be a sodium salt, a potassium salt, or a magnesium salt. In some instances, the salt is NaCl. In some instances, the salt is KNO3. In some instances, the salt concentration is less than 150 mM, less than 125 mM, less than 100 mM, less than 75 mM, less than 50 mM, or less than 25 mM.

Certain conditions that may enhance the activity of an effector protein include the pH of the solution in which the activity occurs. For example, increasing pH may enhance transcollateral activity. For example, the rate of transcollateral activity may increase with increase in pH up to pH 9. In some instances, the pH is about 7, about 7.1, about 7.2, about 7.3, about 7.4, about 7.5, about 7.6, about 7.7, about 7.8, about 7.9, about 8, about 8.1, about 8.2, about 8.3, about 8.4, about 8.5, about 8.6, about 8.7, about 8.8, about 8.9, or about 9. In some instances, the pH is 7 to 7.5, 7.5 to 8, 8 to 8.5, 8.5 to 9, or 7 to 8.5. In some instances, the pH is less than 7. In some instances, the pH is greater than 7.

Certain conditions that may enhance the activity of an effector protein include the temperature at which the activity is performed. In some instances, the temperature is about 25° C. to about 50° C. In some instances, the temperature is about 20° C. to about 40° C., about 30° C. to about 50° C., or about 40° C. to about 60° C. In some instances, the temperature is about 25° C., about 30° C., about 35° C., about 40° C., about 45° C., or about 50° C.

In some instances, a final concentration of an effector protein in a buffer of a system is 1 pM to 1 nM, 1 pM to 10 pM, 10 pM to 100 pM, 100 pM to 1 nM, 1 nM to 10 nM, 10 nM to 20 nM, 20 nM to 30 nM, 30 nM to 40 nM, 40 nM to 50 nM, 50 nM to 60 nM, 60 nM to 70 nM, 70 nM to 80 nM, 80 nM to 90 nM, 90 nM to 100 nM, 100 nM to 200 nM, 200 nM to 300 nM, 300 nM to 400 nM, 400 nM to 500 nM, 500 nM to 600 nM, 600 nM to 700 nM, 700 nM to 800 nM, 800 nM to 900 nM, or 900 nM to 1000 nM. The final concentration of the sgRNA complementary to the target nucleic acid may be 1 pM to 1 nM, 1 pM to 10 pM, 10 pM to 100 pM, 100 pM to 1 nM, 1 nM to 10 nM, 10 nM to 20 nM, 20 nM to 30 nM, 30 nM to 40 nM, 40 nM to 50 nM, 50 nM to 60 nM, 60 nM to 70 nM, 70 nM to 80 nM, 80 nM to 90 nM, 90 nM to 100 nM, 100 nM to 200 nM, 200 nM to 300 nM, 300 nM to 400 nM, 400 nM to 500 nM, 500 nM to 600 nM, 600 nM to 700 nM, 700 nM to 800 nM, 800 nM to 900 nM, or 900 nM to 1000 nM. The concentration of the ssDNA-FQ reporter may be 1 pM to 1 nM, 1 pM to 10 pM, 10 pM to 100 pM, 100 pM to 1 nM, 1 nM to 10 nM, 10 nM to 20 nM, 20 nM to 30 nM, 30 nM to 40 nM, 40 nM to 50 nM, 50 nM to 60 nM, 60 nM to 70 nM, 70 nM to 80 nM, 80 nM to 90 nM, 90 nM to 100 nM, 100 nM to 200 nM, 200 nM to 300 nM, 300 nM to 400 nM, 400 nM to 500 nM, 500 nM to 600 nM, 600 nM to 700 nM, 700 nM to 800 nM, 800 nM to 900 nM, or 900 nM to 1000 nM.

In some instances, systems comprise an excess volume of solution comprising the guide nucleic acid, the effector protein and the reporter, which contacts a smaller volume comprising a sample with a target nucleic acid. The smaller volume comprising the sample may be unlysed sample, lysed sample, or lysed sample which has undergone any combination of reverse transcription, amplification, and in vitro transcription. The presence of various reagents (such as buffer, magnesium sulfate, salts, the pH, a reducing agent, primers, dNTPs, NTPs, cellular lysates, non-target nucleic acids, or other components) in a crude, non-lysed sample, a lysed sample, or a lysed and amplified sample may inhibit the ability of the effector protein to become activated or to find and cleave the nucleic acid of the reporter. This may be due to nucleic acids that are not the reporter outcompeting the nucleic acid of the reporter for the effector protein. Alternatively, various reagents in the sample may simply inhibit the activity of the effector protein. Thus, the compositions and methods provided herein for contacting an excess volume comprising the engineered guide nucleic acid, the effector protein, and the reporter to a smaller volume comprising the sample with the target nucleic acid of interest provide for superior detection of the target nucleic acid by ensuring that the effector protein is able to find and cleave the nucleic acid of the reporter. In some instances, the volume comprising the guide nucleic acid, the effector protein, and the reporter (may be referred to as “a second volume”) is 4-fold greater than a volume comprising the sample (may be referred to as “a first volume”).
In some instances, the volume comprising the guide nucleic acid, the effector protein, and the reporter (may be referred to as “a second volume”) is at least 1.5 fold, at least 2 fold, at least 3 fold, at least 4 fold, at least 5 fold, at least 6 fold, at least 7 fold, at least 8 fold, at least 9 fold, at least 10 fold, at least 11 fold, at least 12 fold, at least 13 fold, at least 14 fold, at least 15 fold, at least 16 fold, at least 17 fold, at least 18 fold, at least 19 fold, at least 20 fold, at least 30 fold, at least 40 fold, at least 50 fold, at least 60 fold, at least 70 fold, at least 80 fold, at least 90 fold, at least 100 fold, 1.5 fold to 100 fold, 2 fold to 10 fold, 10 fold to 20 fold, 20 fold to 30 fold, 30 fold to 40 fold, 40 fold to 50 fold, 50 fold to 60 fold, 60 fold to 70 fold, 70 fold to 80 fold, 80 fold to 90 fold, 90 fold to 100 fold, 1.5 fold to 10 fold, 1.5 fold to 20 fold, 10 fold to 40 fold, 20 fold to 60 fold, or 10 fold to 80 fold greater than a volume comprising the sample (may be referred to as “a first volume”). In some instances, the volume comprising the sample is at least 0.5 μL, at least 1 μL, at least 2 μL, at least 3 μL, at least 4 μL, at least 5 μL, at least 6 μL, at least 7 μL, at least 8 μL, at least 9 μL, at least 10 μL, at least 11 μL, at least 12 μL, at least 13 μL, at least 14 μL, at least 15 μL, at least 16 μL, at least 17 μL, at least 18 μL, at least 19 μL, at least 20 μL, at least 25 μL, at least 30 μL, at least 35 μL, at least 40 μL, at least 45 μL, at least 50 μL, at least 55 μL, at least 60 μL, at least 65 μL, at least 70 μL, at least 75 μL, at least 80 μL, at least 85 μL, at least 90 μL, at least 95 μL, at least 100 μL, 0.5 μL to 5 μL, 5 μL to 10 μL, 10 μL to 15 μL, 15 μL to 20 μL, 20 μL to 25 μL, 25 μL to 30 μL, 30 μL to 35 μL, 35 μL to 40 μL, 40 μL to 45 μL, 45 μL to 50 μL, 10 μL to 20 μL, 5 μL to 20 μL, 1 μL to 40 μL, 2 μL to 10 μL, or 1 μL to 10 μL.
In some instances, the volume comprising the effector protein, the guide nucleic acid, and the reporter is at least 10 μL, at least 11 μL, at least 12 μL, at least 13 μL, at least 14 μL, at least 15 μL, at least 16 μL, at least 17 μL, at least 18 μL, at least 19 μL, at least 20 μL, at least 21 μL, at least 22 μL, at least 23 μL, at least 24 μL, at least 25 μL, at least 26 μL, at least 27 μL, at least 28 μL, at least 29 μL, at least 30 μL, at least 40 μL, at least 50 μL, at least 60 μL, at least 70 μL, at least 80 μL, at least 90 μL, at least 100 μL, at least 150 μL, at least 200 μL, at least 250 μL, at least 300 μL, at least 350 μL, at least 400 μL, at least 450 μL, at least 500 μL, 10 μL to 15 μL, 15 μL to 20 μL, 20 μL to 25 μL, 25 μL to 30 μL, 30 μL to 35 μL, 35 μL to 40 μL, 40 μL to 45 μL, 45 μL to 50 μL, 50 μL to 55 μL, 55 μL to 60 μL, 60 μL to 65 μL, 65 μL to 70 μL, 70 μL to 75 μL, 75 μL to 80 μL, 80 μL to 85 μL, 85 μL to 90 μL, 90 μL to 95 μL, 95 μL to 100 μL, 100 μL to 150 μL, 150 μL to 200 μL, 200 μL to 250 μL, 250 μL to 300 μL, 300 μL to 350 μL, 350 μL to 400 μL, 400 μL to 450 μL, 450 μL to 500 μL, 10 μL to 20 μL, 10 μL to 30 μL, 25 μL to 35 μL, 10 μL to 40 μL, 20 μL to 50 μL, 18 μL to 28 μL, or 17 μL to 22 μL.
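
As a non-limiting illustration, the relationship between the second volume (guide nucleic acid, effector protein, and reporter) and the first volume (sample) can be checked with a short calculation. The sketch below is illustrative only: the function names and the default threshold argument are hypothetical, and it assumes that "N-fold greater" is read as a simple ratio of the second volume to the first volume.

```python
def fold_excess(second_volume_ul: float, first_volume_ul: float) -> float:
    """Return the ratio of the second (detection) volume to the first (sample) volume.

    Assumption: "N-fold greater" is interpreted as second_volume / first_volume.
    """
    if second_volume_ul <= 0 or first_volume_ul <= 0:
        raise ValueError("volumes must be positive")
    return second_volume_ul / first_volume_ul


def meets_minimum_fold(second_volume_ul: float,
                       first_volume_ul: float,
                       minimum_fold: float = 4.0) -> bool:
    """Check whether the detection volume is at least `minimum_fold` times the sample volume."""
    return fold_excess(second_volume_ul, first_volume_ul) >= minimum_fold


if __name__ == "__main__":
    # Example: 20 uL of detection mix contacted with 5 uL of sample.
    print(fold_excess(20.0, 5.0))
    print(meets_minimum_fold(20.0, 5.0))
```

Under these assumptions, a 20 μL second volume and a 5 μL first volume correspond to the 4-fold instance recited above.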

In some instances, systems comprise an effector protein that nicks a target nucleic acid, thereby producing a nicked product. In some instances, systems cleave a target nucleic acid, thereby producing a linearized product. In some instances, systems produce at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of a maximum amount of nicked product within 1 minute, where the maximum amount of nicked product is the maximum amount detected within a 60 minute period from when the target nucleic acid is mixed with the effector protein or the multimeric complex thereof. In some instances, systems produce at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of a maximum amount of linearized product within 1 minute, where the maximum amount of linearized product is the maximum amount detected within a 60 minute period from when the target nucleic acid is mixed with the effector protein. In some instances, at least 80% of the maximum amount of linearized product is produced within 1 minute. In some instances, at least 90% of the maximum amount of linearized product is produced within 1 minute.
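
As a non-limiting illustration, the fraction of the maximum amount of product formed within 1 minute can be computed from a measured time course. The sketch below is illustrative only: the function name, the dictionary layout (minutes mapped to product amount), and the example values are hypothetical and are not data from the present disclosure.

```python
def fraction_of_max_within(time_course: dict[float, float],
                           early_minutes: float = 1.0,
                           window_minutes: float = 60.0) -> float:
    """Return (largest amount seen by `early_minutes`) / (maximum amount within the window).

    `time_course` maps time in minutes after mixing to the amount of product
    (nicked or linearized) detected at that time, in any consistent unit.
    """
    in_window = {t: amount for t, amount in time_course.items()
                 if 0 <= t <= window_minutes}
    if not in_window:
        raise ValueError("no measurements inside the observation window")
    maximum = max(in_window.values())
    if maximum <= 0:
        raise ValueError("no product detected inside the observation window")
    early = [amount for t, amount in in_window.items() if t <= early_minutes]
    if not early:
        raise ValueError("no measurement at or before the early time point")
    return max(early) / maximum


if __name__ == "__main__":
    # Hypothetical measurements (arbitrary units), for illustration only.
    example = {0.0: 0.0, 1.0: 8.5, 5.0: 9.5, 60.0: 10.0}
    print(f"{fraction_of_max_within(example):.0%} of maximum within 1 minute")
```

A returned value of 0.80 or more would correspond to the instance in which at least 80% of the maximum amount of product is produced within 1 minute.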

X. Methods and Formulations for Introducing Systems and Compositions into a Target Cell

A guide RNA (or a nucleic acid comprising a nucleotide sequence encoding same) and/or an effector protein described herein can be introduced into a host cell by any of a variety of well-known methods. As a non-limiting example, a guide RNA and/or effector protein can be combined with a lipid. As another non-limiting example, a guide RNA and/or effector protein can be combined with a particle, or formulated into a particle.

Methods for Introducing Systems and Compositions to a Host

Described herein are methods of introducing various components described herein to a host. A host can be any suitable host, such as a host cell. When described herein, a host cell can be an in vivo or in vitro eukaryotic cell, a prokaryotic cell (e.g., bacterial or archaeal cell), or a cell from a multicellular organism (e.g., a cell line) cultured as a unicellular entity, which eukaryotic or prokaryotic cells can be, or have been, used as recipients for methods of introduction described herein, and include the progeny of the original cell which has been transformed by the methods of introduction described herein. It is understood that the progeny of a single cell may not necessarily be completely identical in morphology or in genomic or total DNA complement to the original parent, due to natural, accidental, or deliberate mutation. A host cell can be a recombinant host cell or a genetically modified host cell, if a heterologous nucleic acid, e.g., an expression vector, has been introduced into the cell.

Methods of introducing a nucleic acid and/or protein into a host cell are known in the art, and any convenient method can be used to introduce a subject nucleic acid (e.g., an expression construct/vector) into a target cell (e.g., a human cell, and the like). Suitable methods include, e.g., viral infection, transfection, conjugation, protoplast fusion, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery (see, e.g., Panyam et al. Adv Drug Deliv Rev. 2012 Sep. 13. pii: S0169-409X(12)00283-9. doi: 10.1016/j.addr.2012.09.023), and the like. In some instances, the nucleic acid and/or protein are introduced into a disease cell comprised in a pharmaceutical composition comprising the guide RNA and/or D2S effector protein and a pharmaceutically acceptable excipient.

In certain embodiments, molecules of interest, such as nucleic acids of interest, are introduced to a host. In certain embodiments, polypeptides, such as an effector protein, are introduced to a host. In certain embodiments, vectors, such as lipid particles and/or viral vectors, can be introduced to a host. Introduction can be for contact with a host or for assimilation into the host, for example, introduction into a host cell.

In some instances, described herein are methods of introducing one or more nucleic acids, such as a nucleic acid encoding an effector protein, a nucleic acid encoding an engineered guide nucleic acid, and/or a donor nucleic acid, or combinations thereof, into a host cell. Any suitable method can be used to introduce a nucleic acid into a cell. Suitable methods include, for example, viral infection, transfection, lipofection, electroporation, calcium phosphate precipitation, polyethyleneimine (PEI)-mediated transfection, DEAE-dextran mediated transfection, liposome-mediated transfection, particle gun technology, direct microinjection, nanoparticle-mediated nucleic acid delivery, and the like. Further methods are described throughout.

Introducing one or more nucleic acids into a host cell can occur in any culture media and under any culture conditions that promote the survival of the cells. Introducing one or more nucleic acids into a host cell can be carried out in vivo or ex vivo. Introducing one or more nucleic acids into a host cell can be carried out in vitro.

In some embodiments, an effector protein can be provided as RNA. The RNA can be provided by direct chemical synthesis or may be transcribed in vitro from a DNA (e.g., encoding the effector protein). Once synthesized, the RNA may be introduced into a cell by way of any suitable technique for introducing nucleic acids into cells (e.g., microinjection, electroporation, transfection, etc.). In some embodiments, introduction of one or more nucleic acids can be through the use of a vector and/or a vector system. Accordingly, in some embodiments, compositions and systems described herein comprise a vector and/or a vector system.

Vectors may be introduced directly to a host. In certain embodiments, host cells can be contacted with one or more vectors as described herein, and in certain embodiments, said vectors are taken up by the cells. Methods for contacting cells with vectors include but are not limited to electroporation, calcium chloride transfection, microinjection, lipofection, contact with the cell or particle that comprises a molecule of interest, or a package of cells or particles that comprise molecules of interest.

Components described herein can also be introduced directly to a host. For example, an engineered guide nucleic acid can be introduced to a host, specifically introduced into a host cell. Methods of introducing nucleic acids, such as RNA, into cells include, but are not limited to, direct injection, transfection, or any other method used for the introduction of nucleic acids.

Polypeptides (e.g., effector proteins) described herein can also be introduced directly to a host. In some embodiments, polypeptides described herein can be modified to promote introduction to a host. For example, polypeptides described herein can be modified to increase the solubility of the polypeptide. Such a polypeptide may optionally be fused to a polypeptide domain that increases solubility. The domain may be linked to the polypeptide through a defined protease cleavage site, such as a TEV sequence which is cleaved by TEV protease. The linker may also include one or more flexible sequences, e.g., from 1 to 10 glycine residues. In some embodiments, the cleavage of the polypeptide is performed in a buffer that maintains solubility of the product, e.g., in the presence of from 0.5 to 2 M urea, in the presence of polypeptides and/or polynucleotides that increase solubility, and the like. Domains of interest include endosomolytic domains, e.g., influenza HA domain; and other polypeptides that aid in production, e.g., IF2 domain, GST domain, GRPE domain, and the like. In another example, the polypeptide can be modified to improve stability. For example, the polypeptides may be PEGylated, where the polyethyleneoxy group provides for enhanced lifetime in the blood stream. Polypeptides can also be modified to promote uptake by a host, such as a host cell. For example, a polypeptide described herein can be fused to a polypeptide permeant domain to promote uptake by a host cell. Any suitable permeant domains can be used in the non-integrating polypeptides of the present disclosure, including peptides, peptidomimetics, and non-peptide carriers. 
Examples include penetratin, a permeant peptide derived from the third alpha helix of Drosophila melanogaster transcription factor Antennapedia; the HIV-1 tat basic region amino acid sequence, e.g., amino acids 49-57 of a naturally-occurring tat protein; and poly-arginine motifs, for example, the region of amino acids 34-56 of HIV-1 rev protein, nona-arginine, octa-arginine, and the like. The site at which the fusion is made may be selected in order to optimize the biological activity, secretion or binding characteristics of the polypeptide. The optimal site can be determined by suitable methods.

Formulations for Introducing Systems and Compositions to a Host

Described herein are formulations for introducing systems and compositions described herein to a host. In some embodiments, such formulations, systems and compositions described herein comprise an effector protein and a carrier (e.g., excipient, diluent, vehicle, or filling agent).

In some aspects of the present invention, the effector protein is provided in a pharmaceutical composition comprising the effector protein and any pharmaceutically acceptable excipient, carrier, or diluent. In some embodiments, a pharmaceutically acceptable excipient, carrier or diluent can describe any substance formulated alongside the active ingredient of a pharmaceutical composition that allows the active ingredient to retain biological activity and is non-reactive with the subject's immune system. Such a substance can be included for the purpose of long-term stabilization, bulking up solid formulations that contain potent active ingredients in small amounts, or to confer a therapeutic enhancement on the active ingredient in the final dosage form, such as facilitating absorption, reducing viscosity, or enhancing solubility. The selection of appropriate substance can depend upon the route of administration and the dosage form, as well as the active ingredient and other factors. Compositions having such substances can be formulated by well-known conventional methods (see, e.g., Remington's Pharmaceutical Sciences, 18th edition, A. Gennaro, ed., Mack Publishing Co., Easton, Pa., 1990; and Remington, The Science and Practice of Pharmacy 21st Ed. Mack Publishing, 2005).

XI. Pharmaceutical Compositions and Modes of Delivery

Disclosed herein, in some aspects, are pharmaceutical compositions for modifying a target nucleic acid in a cell or a subject, comprising any one of the effector proteins, engineered effector proteins, fusion effector proteins, or guide nucleic acids as described herein and any combination thereof. In some embodiments, a subject can be a biological entity containing expressed genetic materials. The biological entity can be a plant, animal, or microorganism, including, for example, bacteria, viruses, fungi, and protozoa. The subject can be tissues, cells and their progeny of a biological entity obtained in vivo or cultured in vitro. The subject can be a mammal. The mammal can be a human. The subject may be diagnosed or suspected of being at high risk for a disease. In some instances, the subject is not necessarily diagnosed or suspected of being at high risk for the disease.

Also disclosed herein, in some aspects, are pharmaceutical compositions comprising a nucleic acid encoding any one of the effector proteins, engineered effector proteins, fusion effector proteins, or guide nucleic acids as described herein and any combination thereof. In some embodiments, pharmaceutical compositions comprise a plurality of guide nucleic acids. Pharmaceutical compositions may be used to modify a target nucleic acid or the expression thereof in a cell in vitro, in vivo or ex vivo.

In some embodiments, pharmaceutical compositions comprise one or more nucleic acids encoding an effector protein, fusion effector protein, fusion partner, a guide nucleic acid, or a combination thereof; and a pharmaceutically acceptable excipient, carrier or diluent.

The effector protein, fusion effector protein, fusion partner protein, or combination thereof may be any one of those described herein. The one or more nucleic acids may comprise a plasmid. The one or more nucleic acids may comprise a nucleic acid expression vector. The one or more nucleic acids may comprise a viral vector. In some embodiments, the viral vector is a lentiviral vector. In some embodiments, the vector is an adeno-associated viral (AAV) vector. In some embodiments, compositions, including pharmaceutical compositions, comprise a viral vector encoding a fusion effector protein and a guide nucleic acid, wherein at least a portion of the guide nucleic acid binds to the effector protein of the fusion effector protein.

In some embodiments, pharmaceutical compositions comprise a virus comprising a viral vector encoding a fusion effector protein, an effector protein, a fusion partner, a guide nucleic acid, or a combination thereof; and a pharmaceutically acceptable carrier or diluent. The virus may be a lentivirus. The virus may be an adenovirus. The virus may be a non-replicating virus. The virus may be an adeno-associated virus (AAV). The viral vector may be a retroviral vector. Retroviral vectors may include gamma-retroviral vectors such as vectors derived from the Moloney Murine Leukemia Virus (MoMLV, MMLV, MuLV, or MLV) or the Murine Stem Cell Virus (MSCV) genome. Retroviral vectors may include lentiviral vectors such as those derived from the human immunodeficiency virus (HIV) genome. In some embodiments, the viral vector is a chimeric viral vector, comprising viral portions from two or more viruses. In some embodiments, the viral vector is a recombinant viral vector.

In some embodiments, the term recombinant, when describing proteins, polypeptides, peptides and nucleic acids, can describe proteins, polypeptides, peptides and nucleic acids that are products of various combinations of cloning, restriction, and/or ligation steps resulting in a construct having a structural coding or non-coding sequence distinguishable from endogenous nucleic acids found in natural systems. Generally, DNA sequences encoding the structural coding sequence can be assembled from cDNA fragments and short oligonucleotide linkers, or from a series of synthetic oligonucleotides, to provide a synthetic nucleic acid which is capable of being expressed from a recombinant transcriptional unit contained in a cell or in a cell-free transcription and translation system. Such sequences can be provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Genomic DNA comprising the relevant sequences can also be used in the formation of a recombinant gene or transcriptional unit. Sequences of non-translated DNA may be present 5′ or 3′ from the open reading frame, where such sequences do not interfere with manipulation or expression of the coding regions and may act to modulate production of a desired product by various mechanisms. Thus, for example, a recombinant polynucleotide or a recombinant nucleic acid can describe one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of sequence through human intervention. This artificial combination is often accomplished by either chemical synthesis means, or by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques. This is usually done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site. 
Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a desired combination of functions. Similarly, a recombinant polypeptide or recombinant protein can describe one which is not naturally occurring, e.g., is made by the artificial combination of two otherwise separated segments of amino acid sequences through human intervention. Thus, for example, a polypeptide that includes a heterologous amino acid sequence is a recombinant polypeptide.

In some embodiments, the viral vector is an AAV. The AAV may be any AAV known in the art. In some embodiments, the viral vector corresponds to a virus of a specific serotype. In some examples, the serotype is selected from an AAV1 serotype, an AAV2 serotype, an AAV3 serotype, an AAV4 serotype, an AAV5 serotype, an AAV6 serotype, an AAV7 serotype, an AAV8 serotype, an AAV9 serotype, an AAV10 serotype, an AAV11 serotype, and an AAV12 serotype. In some embodiments, the AAV vector is a recombinant vector, a hybrid AAV vector, a chimeric AAV vector, a self-complementary AAV (scAAV) vector, a single-stranded AAV, or any combination thereof. scAAV genomes are generally known in the art and contain both DNA strands which can anneal together to form double-stranded DNA.

In some embodiments, methods of producing delivery vectors herein comprise packaging a nucleic acid encoding an effector protein and a guide nucleic acid, or a combination thereof, into an AAV vector. In some embodiments, methods of producing the delivery vector comprise: (a) contacting a cell with at least one nucleic acid encoding: (i) a guide nucleic acid; (ii) a Replication (Rep) gene; and (iii) a Capsid (Cap) gene that encodes an AAV capsid protein; (b) expressing the AAV capsid protein in the cell; (c) assembling an AAV particle; and (d) packaging a Cas effector-encoding nucleic acid into the AAV particle, thereby generating an AAV delivery vector. In some embodiments, promoters, stuffer sequences, and any combination thereof may be packaged in the AAV vector. In some examples, the AAV vector can package 1, 2, 3, 4, or 5 guide nucleic acids or copies thereof. In some embodiments, the AAV vector comprises inverted terminal repeats, e.g., a 5′ inverted terminal repeat and a 3′ inverted terminal repeat. In some embodiments, the AAV vector comprises a mutated inverted terminal repeat that lacks a terminal resolution site.

In some embodiments, a hybrid AAV vector is produced by transcapsidation, e.g., packaging an inverted terminal repeat (ITR) from a first serotype into a capsid of a second serotype, wherein the first and second serotypes may not be the same. In some examples, the Rep gene and ITR from a first AAV serotype (e.g., AAV2) may be used in a capsid from a second AAV serotype (e.g., AAV9), wherein the first and second AAV serotypes may not be the same. As a non-limiting example, a hybrid AAV serotype comprising the AAV2 ITRs and AAV9 capsid protein may be indicated AAV2/9. In some examples, the hybrid AAV delivery vector comprises an AAV2/1, AAV2/2, AAV2/4, AAV2/5, AAV2/8, or AAV2/9 vector.

In some embodiments, the AAV vector may be a chimeric AAV vector. In some embodiments, the chimeric AAV vector comprises an exogenous amino acid or an amino acid substitution, or capsid proteins from two or more serotypes. In some examples, a chimeric AAV vector may be genetically engineered to increase transduction efficiency, selectivity, or a combination thereof.

In some examples, the delivery vector may be a eukaryotic vector, a prokaryotic vector (e.g., a bacterial vector), a viral vector, or any combination thereof. In some embodiments, the delivery vehicle may be a non-viral vector. In some embodiments, the delivery vehicle may be a plasmid. In some embodiments, the plasmid comprises DNA. In some embodiments, the plasmid comprises RNA. In some examples, the plasmid comprises circular double-stranded DNA. In some examples, the plasmid may be linear. In some examples, the plasmid comprises one or more genes of interest and one or more regulatory elements. In some examples, the plasmid comprises a bacterial backbone containing an origin of replication and an antibiotic resistance gene or other selectable marker for plasmid amplification in bacteria. In some examples, the plasmid may be a minicircle plasmid. In some examples, the plasmid contains one or more genes that provide a selective marker to induce a target cell to retain the plasmid. In some examples, the plasmid may be formulated for delivery through injection by a needle-carrying syringe. In some examples, the plasmid may be formulated for delivery via electroporation. In some examples, the plasmids may be engineered through synthetic or other suitable means known in the art. For example, in some embodiments, the genetic elements may be assembled by restriction digest of the desired genetic sequence from a donor plasmid or organism to produce ends of the DNA which may then be readily ligated to another genetic sequence.

In some embodiments, the vector is a non-viral vector, and a physical method or a chemical method is employed for delivery into the somatic cell. Exemplary physical methods include electroporation, gene gun, sonoporation, magnetofection, or hydrodynamic delivery. Exemplary chemical methods include delivery of the recombinant polynucleotide via liposomes, such as cationic lipids or neutral lipids; dendrimers; nanoparticles; or cell-penetrating peptides.

In some embodiments, a fusion effector protein as described herein is inserted into a vector. In some embodiments, the vector comprises one or more promoters, enhancers, ribosome binding sites, RNA splice sites, polyadenylation sites, a replication origin, and/or transcriptional terminator sequences.

In general, plasmids and vectors described herein comprise at least one promoter. In some embodiments, the promoters are constitutive promoters. In other embodiments, the promoters are inducible promoters. In additional embodiments, the promoters are prokaryotic promoters (e.g., drive expression of a gene in a prokaryotic cell). In some embodiments, the promoters are eukaryotic promoters (e.g., drive expression of a gene in a eukaryotic cell). Exemplary promoters include, but are not limited to, CMV, EF1a, SV40, PGK1, Ubc, human beta actin, CAG, TRE, UAS, Ac5, polyhedrin, CaMKIIa, GAL1-10, TEF1, GDS, ADH1, CaMV35S, Ubi, H1, U6, and HSV TK promoter. In some embodiments, the promoter is CMV. In some embodiments, the promoter is EF1a. In some embodiments, the promoter is ubiquitin. In some embodiments, vectors are bicistronic or polycistronic vectors (e.g., having or involving two or more loci responsible for generating a protein) having an internal ribosome entry site (IRES) for translation initiation in a cap-independent manner.

In some embodiments, vectors comprise an enhancer. Enhancers are nucleotide sequences that have the effect of enhancing promoter activity. In some embodiments, enhancers augment transcription regardless of the orientation of their sequence. In some embodiments, enhancers activate transcription from a distance of several kilobase pairs. Furthermore, enhancers are located optionally upstream or downstream of a gene region to be transcribed, and/or located within the gene, to activate the transcription. Exemplary enhancers include, but are not limited to, WPRE; CMV enhancers; the R-U5′ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981); and the genome region of human growth hormone (J Immunol., Vol. 155(3), p. 1286-95, 1995).

Pharmaceutical compositions described herein may comprise a salt. In some embodiments, the salt is a sodium salt. In some embodiments, the salt is a potassium salt. In some embodiments, the salt is a magnesium salt. In some embodiments, the salt is NaCl. In some embodiments, the salt is KNO₃. In some embodiments, the salt is Mg²⁺ SO₄²⁻.

Non-limiting examples of pharmaceutically acceptable carriers and diluents suitable for the pharmaceutical compositions disclosed herein include buffers (e.g., neutral buffered saline, phosphate buffered saline); carbohydrates (e.g., glucose, mannose, sucrose, dextran, mannitol); polypeptides or amino acids (e.g., glycine); antioxidants; chelating agents (e.g., EDTA, glutathione); adjuvants (e.g., aluminum hydroxide); surfactants (e.g., Polysorbate 80, Polysorbate 20, or Pluronic F68); glycerol; sorbitol; mannitol; polyethylene glycol; and preservatives.

In some embodiments, pharmaceutical compositions are in the form of a solution (e.g., a liquid). The solution may be formulated for injection, e.g., intravenous or subcutaneous injection. In some embodiments, the pH of the solution is about 7, about 7.1, about 7.2, about 7.3, about 7.4, about 7.5, about 7.6, about 7.7, about 7.8, about 7.9, about 8, about 8.1, about 8.2, about 8.3, about 8.4, about 8.5, about 8.6, about 8.7, about 8.8, about 8.9, or about 9. In some embodiments, the pH is 7 to 7.5, 7.5 to 8, 8 to 8.5, 8.5 to 9, or 7 to 8.5. In some embodiments, the pH of the solution is less than 7. In some embodiments, the pH is greater than 7.

In some embodiments, pharmaceutical compositions comprise: an effector protein, fusion effector protein, fusion partner, a guide nucleic acid, or a combination thereof; and a pharmaceutically acceptable carrier or diluent. In some embodiments, pharmaceutical compositions comprise one or more nucleic acids encoding: an effector protein, fusion effector protein, fusion partner, a guide nucleic acid, or a combination thereof; and a pharmaceutically acceptable carrier or diluent. In some embodiments, the guide nucleic acid can be a plurality of guide nucleic acids. In some embodiments, pharmaceutical compositions comprise an effector protein and a guide nucleic acid, wherein the effector protein comprises a sequence that is at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of the sequences of SEQ ID NOS: 1-45, 202-293, and 728-731, and the guide nucleic acid comprises a nucleotide sequence of any one of the nucleotide sequences of SEQ ID NOS: 624, 628, 630, 634, 638, 641, 643, 645, 646, and 827-929.

XII. Methods of Detecting a Target Nucleic Acid

Provided herein are methods of detecting target nucleic acids. Methods may comprise detecting target nucleic acids with compositions or systems described herein. Methods may comprise detecting a target nucleic acid with systems described herein that comprise a DETECTR assay. Methods may comprise detecting a target nucleic acid in a sample, e.g., a cell lysate, a biological fluid, or environmental sample. Methods may comprise detecting a target nucleic acid in a cell. In some instances, methods of detecting a target nucleic acid in a sample or cell comprise contacting the sample or cell with a D2S effector protein or a multimeric complex thereof, a guide nucleic acid, wherein at least a portion of the guide nucleic acid is complementary to at least a portion of the target nucleic acid, and a reporter nucleic acid that is cleaved in the presence of the D2S effector protein, the guide nucleic acid, and the target nucleic acid, and detecting a signal produced by cleavage of the reporter nucleic acid, thereby detecting the target nucleic acid in the sample. In some instances, methods result in transcollateral cleavage of the reporter nucleic acid. In some instances, methods result in cis cleavage of the reporter nucleic acid.

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45. In some instances, the amino acid sequence of the effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the nucleobase sequence of the guide is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide comprises a crRNA nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide comprises a tracrRNA nucleobase sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.
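
As a non-limiting illustration, a percent identity value of the kind recited above can be computed from two sequences that have already been aligned. The sketch below is illustrative only and makes several assumptions that are not taken from the present disclosure: the function name is hypothetical, the inputs are pre-aligned strings of equal length with "-" marking gaps, identity is counted over all alignment columns, and a gap opposite a residue is counted as a mismatch. Other conventions for computing percent identity exist and may give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumptions: "-" marks a gap; columns where both sequences have a gap are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float = 70.0) -> bool:
    """Check whether two aligned sequences are at least `threshold_percent` identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical short peptide fragments, for illustration only.
    print(percent_identity("MKTAYIAK", "MKTAYIGK"))
```

The same function applies to nucleobase sequences (e.g., crRNA or tracrRNA sequences), since it only compares characters column by column.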

Methods may comprise contacting the sample to a complex comprising a guide nucleic acid comprising a segment that is reverse complementary to a segment of the target nucleic acid and a D2S effector protein that exhibits sequence independent cleavage upon forming a complex comprising the segment of the guide nucleic acid binding to the segment of the target nucleic acid; and assaying for a signal indicating cleavage of at least some protein-nucleic acids of a population of protein-nucleic acids, wherein the signal indicates a presence of the target nucleic acid in the sample and wherein absence of the signal indicates an absence of the target nucleic acid in the sample.

Methods may comprise contacting the sample comprising the target nucleic acid with a guide nucleic acid targeting a target nucleic acid segment, a D2S effector protein capable of being activated when complexed with the guide nucleic acid and the target nucleic acid segment, and a single stranded nucleic acid of a reporter comprising a detection moiety, wherein the nucleic acid of a reporter is capable of being cleaved by the activated D2S effector protein, thereby generating a first detectable signal; cleaving the single stranded nucleic acid of a reporter using the D2S effector protein that cleaves as measured by a change in color; and measuring the first detectable signal on the support medium.

Methods may comprise contacting the sample or cell with a D2S effector protein or a multimeric complex thereof and a guide nucleic acid at a temperature of at least about 25° C., at least about 30° C., at least about 35° C., at least about 40° C., at least about 50° C., or at least about 65° C. In some instances, the temperature is not greater than 80° C. In some instances, the temperature is about 25° C., about 30° C., about 35° C., about 40° C., about 45° C., about 50° C., about 55° C., about 60° C., about 65° C., or about 70° C. In some instances, the temperature is about 25° C. to about 45° C., about 35° C. to about 55° C., or about 55° C. to about 65° C.

Methods may comprise cleaving a strand of a single-stranded target nucleic acid with a D2S effector protein or a multimeric complex thereof, as assessed with an in vitro cis-cleavage assay. A cleavage assay can comprise an assay designed to visualize, quantitate or identify cleavage of a nucleic acid. In some cases, the cleavage activity may be cis-cleavage activity. In some cases, the cleavage activity may be trans-cleavage activity. An example of such an assay (an in vitro cis-cleavage assay) may follow a procedure comprising: (i) providing equimolar (e.g., 500 nM) amounts of a D2S effector protein comprising at least 70% sequence identity to any one of SEQ ID NOs: 1-45 and a guide nucleic acid at 40 to 45° C. for 5 minutes in pH 7.5 Tris-HCl buffer, 40 mM NaCl, 2 mM Ca(NO3)2, 1 mM BME, thereby forming a ribonucleoprotein complex comprising a dimer of the D2S effector protein and the guide nucleic acid; (ii) adding linear dsDNA comprising a nucleic acid sequence targeted by the guide nucleic acid and adjacent to a PAM comprising the sequence 5′-TTTA-3′; (iii) incubating the mixture at 45° C. for 20 minutes, thereby enabling cleavage of the linear dsDNA; (iv) quenching the reaction with EDTA and a protease; and (v) analyzing the reaction products (e.g., viewing the cleaved and uncleaved linear dsDNA with gel electrophoresis).

In some embodiments, cleave, cleaving, and cleavage, with reference to a nucleic acid molecule or nuclease activity of an effector protein, comprise the hydrolysis of a phosphodiester bond of a nucleic acid molecule that results in breakage of that bond. The result of this breakage can be a nick (hydrolysis of a single phosphodiester bond on one side of a double-stranded molecule), single strand break (hydrolysis of a single phosphodiester bond on a single-stranded molecule) or double strand break (hydrolysis of two phosphodiester bonds on both sides of a double-stranded molecule) depending upon whether the nucleic acid molecule is single-stranded (e.g., ssDNA or ssRNA) or double-stranded (e.g., dsDNA) and the type of nuclease activity being catalyzed by the effector protein.

In some cases, there is a threshold of detection for methods of detecting target nucleic acids. In some instances, methods are not capable of detecting target nucleic acids that are present in a sample or solution at a concentration less than or equal to 10 nM. The term “threshold of detection” is used herein to describe the minimal amount of target nucleic acid that must be present in a sample in order for detection to occur. For example, when a threshold of detection is 10 nM, then a signal can be detected when a target nucleic acid is present in the sample at a concentration of 10 nM or more. In some cases, the threshold of detection is less than or equal to 5 nM, 1 nM, 0.5 nM, 0.1 nM, 0.05 nM, 0.01 nM, 0.005 nM, 0.001 nM, 0.0005 nM, 0.0001 nM, 0.00005 nM, 0.00001 nM, 10 pM, 1 pM, 500 fM, 250 fM, 100 fM, 50 fM, 10 fM, 5 fM, 1 fM, 500 attomolar (aM), 100 aM, 50 aM, 10 aM, or 1 aM. In some cases, the threshold of detection is in a range of from 1 aM to 1 nM, 1 aM to 500 pM, 1 aM to 200 pM, 1 aM to 100 pM, 1 aM to 10 pM, 1 aM to 1 pM, 1 aM to 500 fM, 1 aM to 100 fM, 1 aM to 1 fM, 1 aM to 500 aM, 1 aM to 100 aM, 1 aM to 50 aM, 1 aM to 10 aM, 10 aM to 1 nM, 10 aM to 500 pM, 10 aM to 200 pM, 10 aM to 100 pM, 10 aM to 10 pM, 10 aM to 1 pM, 10 aM to 500 fM, 10 aM to 100 fM, 10 aM to 1 fM, 10 aM to 500 aM, 10 aM to 100 aM, 10 aM to 50 aM, 100 aM to 1 nM, 100 aM to 500 pM, 100 aM to 200 pM, 100 aM to 100 pM, 100 aM to 10 pM, 100 aM to 1 pM, 100 aM to 500 fM, 100 aM to 100 fM, 100 aM to 1 fM, 100 aM to 500 aM, 500 aM to 1 nM, 500 aM to 500 pM, 500 aM to 200 pM, 500 aM to 100 pM, 500 aM to 10 pM, 500 aM to 1 pM, 500 aM to 500 fM, 500 aM to 100 fM, 500 aM to 1 fM, 1 fM to 1 nM, 1 fM to 500 pM, 1 fM to 200 pM, 1 fM to 100 pM, 1 fM to 10 pM, 1 fM to 1 pM, 10 fM to 1 nM, 10 fM to 500 pM, 10 fM to 200 pM, 10 fM to 100 pM, 10 fM to 10 pM, 10 fM to 1 pM, 500 fM to 1 nM, 500 fM to 500 pM, 500 fM to 200 pM, 500 fM to 100 pM, 500 fM to 10 pM, 500 fM to 1 pM, 800 fM to 1 nM, 800 fM to 500 pM, 800 fM to 200 pM, 800 fM to 100 pM, 800 fM to 10 pM, 800 fM to 1 pM, 1 pM to 1 nM, 1 pM to 500 pM, 1 pM to 200 pM, 1 pM to 100 pM, or 1 pM to 10 pM. In some cases, the threshold of detection is in a range of from 800 fM to 100 pM, 1 pM to 10 pM, 10 fM to 500 fM, 10 fM to 50 fM, 50 fM to 100 fM, 100 fM to 250 fM, or 250 fM to 500 fM. In some cases, the threshold of detection is in a range of from 2 aM to 100 pM, from 20 aM to 50 pM, from 50 aM to 20 pM, from 200 aM to 5 pM, or from 500 aM to 2 pM. In some cases, the minimum concentration at which a target nucleic acid is detected in a sample is in a range of from 1 aM to 1 nM, 10 aM to 1 nM, 100 aM to 1 nM, 500 aM to 1 nM, 1 fM to 1 nM, 1 fM to 500 pM, 1 fM to 200 pM, 1 fM to 100 pM, 1 fM to 10 pM, 1 fM to 1 pM, 10 fM to 1 nM, 10 fM to 500 pM, 10 fM to 200 pM, 10 fM to 100 pM, 10 fM to 10 pM, 10 fM to 1 pM, 500 fM to 1 nM, 500 fM to 500 pM, 500 fM to 200 pM, 500 fM to 100 pM, 500 fM to 10 pM, 500 fM to 1 pM, 800 fM to 1 nM, 800 fM to 500 pM, 800 fM to 200 pM, 800 fM to 100 pM, 800 fM to 10 pM, 800 fM to 1 pM, 1 pM to 1 nM, 1 pM to 500 pM, 1 pM to 200 pM, 1 pM to 100 pM, or 1 pM to 10 pM. In some cases, the minimum concentration at which a target nucleic acid is detected in a sample is in a range of from 2 aM to 100 pM, from 20 aM to 50 pM, from 50 aM to 20 pM, from 200 aM to 5 pM, or from 500 aM to 2 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 1 aM to 100 pM. In some cases, the minimum concentration at which a target nucleic acid can be detected in a sample is in a range of from 1 fM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 10 fM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 800 fM to 100 pM. In some cases, the minimum concentration at which a single stranded target nucleic acid can be detected in a sample is in a range of from 1 pM to 10 pM. In some cases, the devices, systems, fluidic devices, kits, and methods described herein detect a target single-stranded nucleic acid in a sample comprising a plurality of nucleic acids such as a plurality of non-target nucleic acids, where the target single-stranded nucleic acid is present at a concentration as low as 1 aM, 10 aM, 100 aM, 500 aM, 1 fM, 10 fM, 500 fM, 800 fM, 1 pM, 10 pM, or 100 pM.
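Because the thresholds of detection above mix nanomolar, picomolar, femtomolar, and attomolar units, comparing a sample concentration against a threshold is easier after converting everything to one unit. The following is a minimal sketch of that conversion and of the comparison described above (a signal can be detected when the target is present at the threshold concentration or more). The function names are hypothetical, and exact rational arithmetic is an implementation choice made here to avoid floating-point rounding.

```python
from fractions import Fraction

# Power of ten, relative to 1 molar, for each concentration unit used above.
UNIT_EXPONENT = {"aM": -18, "fM": -15, "pM": -12, "nM": -9, "uM": -6}


def to_attomolar(value: str, unit: str) -> Fraction:
    """Convert a concentration such as ("0.00001", "nM") to attomolar, exactly."""
    if unit not in UNIT_EXPONENT:
        raise ValueError(f"unknown unit: {unit}")
    # Shift by 18 so that 1 aM maps to 1; every listed unit gives a shift >= 0.
    return Fraction(value) * 10 ** (UNIT_EXPONENT[unit] + 18)


def is_detectable(sample: tuple, threshold: tuple) -> bool:
    """True when the sample concentration is at or above the threshold."""
    return to_attomolar(*sample) >= to_attomolar(*threshold)


if __name__ == "__main__":
    # 0.00001 nM and 10 fM denote the same concentration.
    print(to_attomolar("0.00001", "nM") == to_attomolar("10", "fM"))
    # Hypothetical sample at 5 pM checked against a 1 pM threshold.
    print(is_detectable(("5", "pM"), ("1", "pM")))
```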

In some instances, the target nucleic acid is present in a cleavage reaction at a concentration of about 10 nM, about 20 nM, about 30 nM, about 40 nM, about 50 nM, about 60 nM, about 70 nM, about 80 nM, about 90 nM, about 100 nM, about 200 nM, about 300 nM, about 400 nM, about 500 nM, about 600 nM, about 700 nM, about 800 nM, about 900 nM, about 1 μM, about 10 μM, or about 100 μM. In some instances, the target nucleic acid is present in the cleavage reaction at a concentration of from 10 nM to 20 nM, from 20 nM to 30 nM, from 30 nM to 40 nM, from 40 nM to 50 nM, from 50 nM to 60 nM, from 60 nM to 70 nM, from 70 nM to 80 nM, from 80 nM to 90 nM, from 90 nM to 100 nM, from 100 nM to 200 nM, from 200 nM to 300 nM, from 300 nM to 400 nM, from 400 nM to 500 nM, from 500 nM to 600 nM, from 600 nM to 700 nM, from 700 nM to 800 nM, from 800 nM to 900 nM, from 900 nM to 1 μM, from 1 μM to 10 μM, from 10 μM to 100 μM, from 10 nM to 100 nM, from 10 nM to 1 μM, from 10 nM to 10 μM, from 10 nM to 100 μM, from 100 nM to 1 μM, from 100 nM to 10 μM, from 100 nM to 100 μM, or from 1 μM to 100 μM. In some instances, the target nucleic acid is present in the cleavage reaction at a concentration of from 20 nM to 50 μM, from 50 nM to 20 μM, or from 200 nM to 5 μM.

In some cases, methods detect a target nucleic acid in less than 60 minutes. In some cases, methods detect a target nucleic acid in less than about 120 minutes, less than about 110 minutes, less than about 100 minutes, less than about 90 minutes, less than about 80 minutes, less than about 70 minutes, less than about 60 minutes, less than about 55 minutes, less than about 50 minutes, less than about 45 minutes, less than about 40 minutes, less than about 35 minutes, less than about 30 minutes, less than about 25 minutes, less than about 20 minutes, less than about 15 minutes, less than about 10 minutes, less than about 5 minutes, less than about 4 minutes, less than about 3 minutes, less than about 2 minutes, or less than about 1 minute.

In some cases, methods require at least about 120 minutes, at least about 110 minutes, at least about 100 minutes, at least about 90 minutes, at least about 80 minutes, at least about 70 minutes, at least about 60 minutes, at least about 55 minutes, at least about 50 minutes, at least about 45 minutes, at least about 40 minutes, at least about 35 minutes, at least about 30 minutes, at least about 25 minutes, at least about 20 minutes, at least about 15 minutes, at least about 10 minutes, or at least about 5 minutes to detect a target nucleic acid. In some cases, the sample is contacted with the reagents for from 5 minutes to 120 minutes, from 5 minutes to 100 minutes, from 10 minutes to 90 minutes, from 15 minutes to 45 minutes, or from 20 minutes to 35 minutes.

In some cases, methods of detecting are performed in less than 10 hours, less than 9 hours, less than 8 hours, less than 7 hours, less than 6 hours, less than 5 hours, less than 4 hours, less than 3 hours, less than 2 hours, less than 1 hour, less than 50 minutes, less than 45 minutes, less than 40 minutes, less than 35 minutes, less than 30 minutes, less than 25 minutes, less than 20 minutes, less than 15 minutes, less than 10 minutes, less than 9 minutes, less than 8 minutes, less than 7 minutes, less than 6 minutes, or less than 5 minutes. In some cases, methods of detecting are performed in about 5 minutes to about 10 hours, about 10 minutes to about 8 hours, about 15 minutes to about 6 hours, about 20 minutes to about 5 hours, about 30 minutes to about 2 hours, or about 45 minutes to about 1 hour.

Methods may comprise detecting a detectable signal within 5 minutes of contacting the sample and/or the target nucleic acid with the guide nucleic acid and/or the D2S effector protein. In some cases, detecting occurs within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 110, or 120 minutes of contacting the target nucleic acid. In some instances, detecting occurs within 1 to 120, 5 to 100, 10 to 90, 15 to 80, 20 to 60, or 30 to 45 minutes of contacting the target nucleic acid.

Amplification of a Target Nucleic Acid

Methods may comprise amplifying a target nucleic acid for detection using any of the compositions or systems described herein. Amplifying may comprise changing the temperature of the amplification reaction, also known as thermal amplification (e.g., PCR). Amplifying may be performed at essentially one temperature, also known as isothermal amplification. Amplifying may improve at least one of sensitivity, specificity, or accuracy of the detection of the target nucleic acid.

Amplifying may comprise subjecting a target nucleic acid to an amplification reaction selected from transcription mediated amplification (TMA), helicase dependent amplification (HDA), circular helicase dependent amplification (cHDA), strand displacement amplification (SDA), recombinase polymerase amplification (RPA), loop mediated isothermal amplification (LAMP), exponential amplification reaction (EXPAR), rolling circle amplification (RCA), ligase chain reaction (LCR), simple method amplifying RNA targets (SMART), single primer isothermal amplification (SPIA), multiple displacement amplification (MDA), nucleic acid sequence based amplification (NASBA), hinge-initiated primer-dependent amplification of nucleic acids (HIP), nicking enzyme amplification reaction (NEAR), and improved multiple displacement amplification (IMDA).

In some instances, amplification of the target nucleic acid comprises modifying the sequence of the target nucleic acid. For example, amplification may be used to insert a PAM sequence into a target nucleic acid that lacks a PAM sequence. In some cases, amplification may be used to increase the homogeneity of a target nucleic acid in a sample. For example, amplification may be used to remove a nucleic acid variation that is not of interest in the target nucleic acid sequence.
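Whether a target region already contains a PAM, or would need one introduced during amplification, can be checked with a simple motif scan. The sketch below is illustrative only: it uses the 5′-TTTA-3′ motif from the example cis-cleavage assay as a default, searches both strands of a DNA sequence, and reports hit positions. The function names are hypothetical, and the sketch makes no assumption about where the targeted sequence must lie relative to the PAM.

```python
# Illustrative sketch only: names are hypothetical; the default motif is the
# 5'-TTTA-3' PAM mentioned in the example cis-cleavage assay.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq: str, pam: str = "TTTA") -> list:
    """Return (strand, start) pairs for every PAM occurrence on either strand.

    Start positions on the "-" strand are 0-based indices into the reverse
    complement of the input, not into the input itself.
    """
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        start = strand_seq.find(pam)
        while start != -1:
            hits.append((strand, start))
            # Advance by one so overlapping occurrences are also reported.
            start = strand_seq.find(pam, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical sequences: the first carries a PAM, the second does not.
    print(find_pam_sites("GGTTTACC"))
    print(find_pam_sites("GGGCCC"))
```

An empty result for a region of interest would indicate a candidate for PAM introduction, for example through a primer carrying the motif.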

Amplifying may take 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or 60 minutes. Amplifying may be performed at a temperature of around 20-45° C. Amplifying may be performed at a temperature of less than about 20° C., less than about 25° C., less than about 30° C., less than about 35° C., less than about 37° C., less than about 40° C., or less than about 45° C. The nucleic acid amplification reaction may be performed at a temperature of at least about 20° C., at least about 25° C., at least about 30° C., at least about 35° C., at least about 37° C., at least about 40° C., or at least about 45° C.

Certain Methods of Detection

An illustrative method for detecting a target nucleic acid molecule in a sample comprises contacting the sample comprising the target nucleic acid molecule with (i) a D2S effector protein comprising at least 75% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-45, 202-293, or 728-731; (ii) an engineered guide nucleic acid comprising a region that binds to the effector protein and an additional region that binds to the target nucleic acid; and (iii) a labeled, single stranded RNA reporter; cleaving the labeled single stranded RNA reporter by the effector protein to release a detectable label; and detecting the target nucleic acid by measuring a signal from the detectable label. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.

A further illustrative method for detecting a target nucleic acid molecule in a sample comprises contacting the sample comprising the target nucleic acid molecule with (i) a dimeric protein complex comprising a D2S effector protein comprising at least 75% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-45, 202-293, or 728-731; (ii) an engineered guide nucleic acid comprising a first region that binds to the target nucleic acid; (iii) a nucleic acid comprising a first region that binds to the effector protein and an additional region that hybridizes to a second region of the engineered guide nucleic acid; and (iv) a labeled, single stranded RNA reporter; cleaving the labeled single stranded RNA reporter by the effector protein to release a detectable label; and detecting the target nucleic acid by measuring a signal from the detectable label. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.

XIII. Methods of Nucleic Acid Editing

Provided herein are methods of editing target nucleic acids. In general, editing refers to modifying the nucleobase sequence of a target nucleic acid. However, compositions and systems disclosed herein may also be capable of making epigenetic modifications of target nucleic acids. D2S effector proteins, multimeric complexes thereof and systems described herein may be used for editing or modifying a target nucleic acid. Editing a target nucleic acid may comprise one or more of cleaving the target nucleic acid, deleting one or more nucleotides of the target nucleic acid, inserting one or more nucleotides into the target nucleic acid, mutating one or more nucleotides of the target nucleic acid, or modifying (e.g., methylating, demethylating, deaminating, or oxidizing) one or more nucleotides of the target nucleic acid.

Methods of editing may comprise contacting a target nucleic acid with a D2S effector protein and a guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.

Editing may introduce a mutation (e.g., point mutations, deletions) in a target nucleic acid relative to a corresponding wildtype nucleobase sequence. Editing may remove or correct a disease-causing mutation in a nucleic acid sequence to produce a corresponding wildtype nucleobase sequence. Editing may remove/correct point mutations, deletions, null mutations, or tissue-specific mutations in a target nucleic acid. Editing may be used to generate gene knock-out, gene knock-in, gene editing, gene tagging, or a combination thereof. Methods of the disclosure may be targeted to any locus in a genome of a cell.

Editing may comprise single stranded cleavage, double stranded cleavage, donor nucleic acid insertion, epigenetic modification (e.g., methylation, demethylation, acetylation, or deacetylation), or a combination thereof. In some instances, cleavage (single-stranded or double-stranded) is site-specific, meaning cleavage occurs at a specific site in the target nucleic acid, often within the region of the target nucleic acid that hybridizes with the guide nucleic acid spacer region. In some cases, the D2S effector proteins introduce a single-stranded break in a target nucleic acid to produce a cleaved nucleic acid. In some cases, the effector protein is capable of introducing a break in a single stranded RNA (ssRNA). The D2S effector protein may be coupled to a guide nucleic acid that targets a particular region of interest in the ssRNA. In some instances, the target nucleic acid and the resulting cleaved nucleic acid are contacted with a nucleic acid for homologous recombination (e.g., homology directed repair (HDR)) or non-homologous end joining (NHEJ). In some cases, a double-stranded break in the target nucleic acid may be repaired (e.g., by NHEJ or HDR) without insertion of a donor template, such that the repair results in an indel in the target nucleic acid at or near the site of the double-stranded break.

In some instances, the D2S effector protein is fused to a chromatin-modifying enzyme. In some cases, the fusion protein chemically modifies the target nucleic acid, for example by methylating, demethylating, or acetylating the target nucleic acid in a sequence specific or non-specific manner.

Methods may comprise use of two or more D2S effector proteins. An illustrative method for introducing a break in a target nucleic acid comprises contacting the target nucleic acid with: (a) a first engineered guide nucleic acid comprising a region that binds to a first D2S effector protein, wherein the effector protein comprises at least 75% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-45, 202-293, or 728-731; and (b) a second engineered guide nucleic acid comprising a region that binds to a second D2S effector protein, wherein the effector protein comprises at least 75% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 1-45, 202-293, or 728-731, wherein the first engineered guide nucleic acid comprises an additional region that binds to the target nucleic acid and wherein the second engineered guide nucleic acid comprises an additional region that binds to the target nucleic acid. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.

In some instances, editing a target nucleic acid comprises genome editing. Genome editing may comprise modifying a genome, chromosome, plasmid, or other genetic material of a cell or organism. In some instances, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in vivo. In some instances, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in a cell. In some instances, the genome, chromosome, plasmid, or other genetic material of the cell or organism is modified in vitro. For example, a plasmid may be modified in vitro using a composition described herein and introduced into a cell or organism. In some instances, modifying a target nucleic acid may comprise deleting a sequence from a target nucleic acid. For example, a mutated sequence or a sequence associated with a disease may be removed from a target nucleic acid. In some instances, modifying a target nucleic acid may comprise replacing a sequence in a target nucleic acid with a second sequence. For example, a mutated sequence or a sequence associated with a disease may be replaced with a second sequence lacking the mutation or that is not associated with the disease. In some instances, modifying a target nucleic acid may comprise introducing a sequence into a target nucleic acid. For example, a beneficial sequence or a sequence that may reduce or eliminate a disease may be inserted into the target nucleic acid.

In some instances, methods comprise inserting a donor nucleic acid into a cleaved target nucleic acid. The donor nucleic acid may be inserted at a specified (e.g., effector protein targeted) point within the target nucleic acid. In some instances, methods comprise contacting a target nucleic acid with a D2S effector protein comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731, thereby introducing a single-stranded break in the target nucleic acid; contacting the target nucleic acid with a second effector protein, optionally comprising an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 11-45, 202-293, or 728-731, to generate a second cleavage site in the target nucleic acid; ligating the regions flanking the first and second cleavage site, optionally through NHEJ or single-strand annealing, thereby resulting in the excision of a portion of the target nucleic acid between the first and second cleavage sites from the target nucleic acid; and contacting the target nucleic acid with a donor nucleic acid for homologous recombination, optionally via HDR or NHEJ, thereby introducing a new sequence into the target nucleic acid (e.g., at a cleavage site or in between two cleavage sites).

In some cases, methods comprise editing a target nucleic acid with two or more effector proteins. Editing a target nucleic acid may comprise introducing two or more single-stranded breaks in a target nucleic acid. In some instances, a break may be introduced by contacting a target nucleic acid with an effector protein and a guide nucleic acid. The guide nucleic acid may bind to the effector protein, e.g., a D2S effector protein, and hybridize to a region of the target nucleic acid, thereby recruiting the effector protein to the region of the target nucleic acid. Binding of the effector protein to the guide nucleic acid and the region of the target nucleic acid may activate the effector protein, and the effector protein may introduce a break (e.g., a single stranded break) in the region of the target nucleic acid. In some instances, modifying a target nucleic acid may comprise introducing a first break in a first region of the target nucleic acid and a second break in a second region of the target nucleic acid. For example, modifying a target nucleic acid may comprise contacting a target nucleic acid with a first guide nucleic acid that binds to a first effector protein and hybridizes to a first region of the target nucleic acid and a second guide nucleic acid that binds to a second programmable nickase and hybridizes to a second region of the target nucleic acid. The first effector protein, e.g., a D2S effector protein, may introduce a first break in a first strand at the first region of the target nucleic acid, and the second effector protein may introduce a second break in a second strand at the second region of the target nucleic acid. In some instances, a segment of the target nucleic acid between the first break and the second break may be removed, thereby modifying the target nucleic acid. In some instances, a segment of the target nucleic acid between the first break and the second break may be replaced (e.g., with donor nucleic acid), thereby modifying the target nucleic acid. In some instances, the D2S effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, or 100% identical to any one of SEQ ID NOs: 91-148.
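The removal or replacement of a segment between a first break and a second break can be pictured at the sequence level with a short sketch. This is a simplification under stated assumptions: the target is modeled as a single string, both breaks are given as 0-based positions on that string, and strand-specific offsets and repair outcomes are ignored. The function name and example sequences are hypothetical.

```python
# Illustrative sketch only: a string-level model of excision or replacement
# between two break positions; it does not model strands or cellular repair.
def edit_between_breaks(target: str, first_break: int, second_break: int,
                        donor: str = "") -> str:
    """Remove target[first_break:second_break] and insert donor in its place.

    With the default empty donor the segment is simply excised.
    """
    if not 0 <= first_break <= second_break <= len(target):
        raise ValueError("break positions must satisfy 0 <= first <= second <= len")
    return target[:first_break] + donor + target[second_break:]


if __name__ == "__main__":
    target = "AAACCCGGGTTT"  # hypothetical target sequence
    # Excise the segment between the two breaks.
    print(edit_between_breaks(target, 3, 9))
    # Replace the same segment with a hypothetical donor sequence.
    print(edit_between_breaks(target, 3, 9, donor="TTAA"))
```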

Base editing is a genome editing method that directly generates precise nucleotide changes in genomic DNA or RNA without generating DSBs, requiring a DNA donor template, or relying on cellular homology-directed repair (HDR). In general, base editors comprise a base editing enzyme (e.g., a deaminase) fused to a catalytically inactive CRISPR-associated (Cas) protein, wherein the catalytically inactive CRISPR-associated (Cas) protein is coupled to a guide nucleic acid that imparts activity or sequence selectivity to the base editor. In some embodiments, the effector protein is a catalytically inactive effector protein. In some embodiments, the effector protein comprises an amino acid sequence that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 728, 729, 730, or 731.

In some embodiments, the amino acid sequence of the effector protein is modified relative to a naturally-occurring effector protein. Such modified effector proteins may be referred to as engineered effector proteins. In some embodiments, the engineered effector protein has been modified to inactivate a catalytically active nuclease domain (e.g., a RuvC domain, HNH domain) of the naturally-occurring effector protein. In some embodiments, the engineered effector protein has been modified to reduce the activity of a catalytically active nuclease domain of the naturally-occurring effector protein. The engineered effector protein may have less than 90%, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, less than 10%, less than 5%, or less than 1% of the nucleic acid-cleaving activity of the naturally-occurring effector protein, as compared in a cleavage assay. In some embodiments, the effector protein has been modified to comprise at least 1, at least 2, at least 3, at least 4, or at least 5 amino acid modifications relative to the non-modified version (e.g., wild-type or naturally occurring version) of the effector protein. The amino acid modification(s) may comprise a deletion, insertion, or substitution of an amino acid.

In some cases, editing is achieved by fusing an effector protein, e.g., a D2S effector protein, to a heterologous sequence. The heterologous sequence may be a suitable fusion partner, e.g., a protein that provides recombinase activity by acting on the target nucleic acid sequence. In some instances, the fusion protein comprises a D2S effector protein fused to a heterologous sequence by a linker. The heterologous sequence or fusion partner may be a base editing domain. The base editing domain may be an ADAR1/2 or any functional variant thereof. The heterologous sequence or fusion partner may be fused to the C-terminus, N-terminus, or an internal portion (e.g., a portion other than the N- or C-terminus) of the D2S effector protein. The heterologous sequence or fusion partner may be fused to the D2S effector protein by a linker. A linker may be a peptide linker or a non-peptide linker. In some instances, the linker is an XTEN linker. In some instances, the linker comprises one or more repeats of a tri-peptide GGS (SEQ ID NO: 179). In some instances, the linker is from 1 to 100 amino acids in length. In some instances, the linker is more than 100 amino acids in length. In some instances, the linker is from 10 to 27 amino acids in length. A non-peptide linker may be a polyethylene glycol (PEG), polypropylene glycol (PPG), co-poly(ethylene/propylene) glycol, polyoxyethylene (POE), polyurethane, polyphosphazene, polysaccharides, dextran, polyvinyl alcohol, polyvinylpyrrolidones, polyvinyl ethyl ether, polyacrylamide, polyacrylate, polycyanoacrylates, lipid polymers, chitins, hyaluronic acid, heparin, or an alkyl linker.

In some embodiments, heterologous comprises a nucleotide or polypeptide sequence that is not found in a native nucleic acid or protein, respectively. In some embodiments, fusion proteins comprise an effector protein and a fusion partner protein, wherein the fusion partner protein is heterologous to an effector protein. These fusion proteins can comprise a heterologous protein. A protein that is heterologous to the effector protein is a protein that is not covalently linked via an amide bond to the effector protein in nature. In some embodiments, a heterologous protein is not encoded by a species that encodes the effector protein.

Described herein are methods for editing or detecting a target nucleicacid. In some embodiments, the target nucleic acid comprises a portionor a specific region of a nucleic acid from a genomic locus, any DNAamplicon of, a reverse transcribed mRNA, or a cDNA from one or moregenes selected from AAVS1, ABCA4, ABCB11, ABCC8, ABCD1, ACAD9, ACADM,ACADVL, ACAT1, ACOX1, ACSF3, ADA, ADAMTS2, ADGRG1, AGA, AGL, AGPS, AGXT,AHI1, AIRE, ALDH3A2, ALDOB, ALG6, ALK, ALKBH5, ALMS1, ALPL, AMRC9, AMT,ANGPTL3, APC, Apo(a), APOCIII, APOEc4, APOL1, APP, AQP2, AR, ARFRP1,ARG1, ARL13B, ARL6, ARSA, ARSB, ASL, ASNS, ASPA, ASS1, ATM, ATP6V1B1,ATP7A, ATP7B, ATRX, ATXN1, ATXN10, ATXN2, ATXN3, ATXN7, ATXN8OS, AXIN1,AXIN2, B2M, BACE-1, BAK1, BAP1, BARD1, BAX2, BBS1, BBS10, BBS12, BBS2,BCKDHA, BCKDHB, BCL2L2, BCS1L, BEST1, Betaglobin gene, BLM, BMPR1A,BRAFV600E, BRCA1, BRCA2, BRIP1, BSND, C282Y, C9orf72, CA4, CACNA1A,CAPN3, CASR, CBS, CC2D2A, CCR5, CDC73, CDH1, CDH23, CDK11, CDK4, CDKN1B,CDKN1C, CDKN2A, CEBPA, CEP290, CERKL, CFTR, CHCHD10, CHEK2, CHM, CHRNE,CIITA, CLN3, CLN5, CLN6, CLN8, CLRN1, CLTA, CNBP, CNGB1, CNGB3, COL1A1,COL1A2, COL27A1, COL4A3, COL4A4, COL4A5, COL7A1, CPS1, CPT1A, CPT2,CRB1, CRX, CTNNA1, CTNNB1, CTNND2, CTNS, CTSK, CYBA, CYBB, CYP11B1,CYP11B2, CYP17A1, CYP19A1, CYP27A1, DBT, DCLRE1C, DERL2, DFNA36, DFNB31,DGAT2, DHCR7, DHDDS, DICER1, DIS3L2, DLD, DMD, DMPK, DNAH5, DNAI1,DNAI2, DNM2, DNMT1, DYSF, EDA, EDN3, EDNRB, EGFR, EIF2B5, EMC2, EMC3,EMD, EMX1, EPCAM, ERCC6, ERCC8, ESCO2, ETFA, ETFDH, ETHE1, EVC, EVC2,EYS, F5, F9, FactorB, FactorXI, FAH, FAM161A, FANCA, FANCB, FANCC,FANCD1, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCJ, FANCL, FANCM, FANCN,FANCP, FANCS, FBN1, FGF14, FGFR2, FGFR3, FH, FHL1, FKRP, FKTN, FLCN,FMR1, FOXP3, FSCN2, FUS, FUT8, FVIII, FXII, FXN, G6PC, GAA, GALC, GALK1,GALT, GAMT, GATA2, GBA, GBE1, GCDH, GCGR, GDNF, GFAP, GFM1, GHR, GJB1,GJB2, GLA, GLB1, GLDC, GLE1, GNE, GNPTAB, GNPTG, GNS, GPC3, GPR98,GREM1, GRHPR, GRIN2B, H2AX, HADHA, HAX1, HBA1, HBA2, HBB, HEXA, 
HEXB,HGSNAT, HLCS, HMGCL, HOGA1, HOXB13, HPRPF3, HPRT1, HPS1, HPS3, HRAS,HSD17B4, HSD3B2, HTT, HYAL1, HYLS1, IDS, IDUA, IFITM5, IKBKAP, IL2RG,IMPDH1, INPP5E, IRF4, ITPR1, IVD, JAG1, KCNC3, KCND3, KCNJ11, KLHL7,KRAS, LAMA2, LAMA3, LAMB3, LAMC2, LCA5, LDLR, LDLRAP1, LHX3, LIFR, LIPA,LMNA, LOXHD1, LPL, LRAT, LRP6, LRPPRC, LRRK2, MAN2B1, MAPT, MAX, MCOLN1,MECP2, MED17, MEFV, MEN1, MERTK, MESP2, MET, METex14, MFN2, MFSD8, MITF,MKS1, MLC1, MLH1, MLH3, MMAA, MMAB, MMACHC, MMADHC, MMD, MPI, MPL,MPV17, MSH2, MSH3, MSH6, MTHFR, MTM1, MTRR, MTTP, MUT, MUTYH, MYO7A,NAGLU, NAGS, NBN, NDRG1, NDUFAF5, NDUFS6, NEB, NF1, NF2, NOTCH2, NPC1,NPC2, NPHP1, NPHS1, NPHS2, NR2E3, NTHL1, NTRK, NTRK1, OAT, OCT4, OFD1,OPA3, OTC, PAH, PALB2, PAQR8, PAX3, PC, PCCA, PCCB, PCDH15, PCSK9, PD1,PDCD1, PDE6B, PDGFRA, PDHA1, PDHB, PEX1, PEX10, PEX12, PEX13, PEX14,PEX16, PEX19, PEX2, PEX26, PEX3, PEX5, PEX6, PEX7, PFKM, PHGDH, PHOX2B,PKD1, PKD2, PKHD1, PKK, PLEKHG4, PMM2, PMP22, PMS1, PMS2, PNPLA3, POLD1,POLE, POMGNT1, POT1, POU5F1, PPM1A, PPP2R2B, PPT1, PRCD, PRKAR1A, PRKCG,PRNP, PROM1, PROP1, PRPF31, PRPF8, PRPH2, PRPS1, PSAP, PSD95, PSEN1,PSEN2, PTCH1, PTEN, PTS, PUS1, PYGM, RAB23, RAD50, RAD51C, RAD51D, RAG2,RAPSN, RARS2, RB1, RDH12, RECQL4, RET, RHO, RICTOR, RMRP, ROS1, RP1,RP2, RPE65, RPGR, RPGRIP1L, RPL32P3, RS1, RTEL1, RUNX1, SACS, SAMHD1,SCN1A, SCN2A, SDHA, SDHAF2, SDHB, SDHC, SDHD, SEL1L, SEPSECS, SERPINGLSGCA, SGCB, SGCG, SGSH, SIRT1, SLC12A3, SLC12A6, SLC17A5, SLC22A5,SLC25A13, SLC25A15, SLC26A2, SLC26A4, SLC35A3, SLC37A4, SLC39A4,SLC4A11, SLC6A8, SLC7A7, SMAD4, SMARCA4, SMARCAL1, SMARCB1, SMARCE1,SMN1, SMPD1, SNAI2, SNCA, SNRNP200, SOD1, SOX10, SPARA7, SPTBN2, STAR,STAT3, STK11, SUFU, SUMF1, SYNE1, SYNE2, SYS1, TARDBP, TAT, TBK1, TBP,TCIRG1, TCTN3, TECPR2, TERC, TERT, TFR2, TGFBR2, TGM1, TH, TLE3,TMEM127, TMEM138, TMEM216, TMEM43, TMEM67, TMPRSS6, TOP1, TOPORS, TP53,TPP1, IRAC, TRMU, TSFM, TSPAN14, TTBK2, TTC8, TTPA, TTR, TULP1, TYMP,UBE2G2, UBE2J1, UBE3A, USH1C, USH1G, USH2A, VEGF, VHL, 
VPS13A, VPS13B, VPS35, VPS45, VRK1, VSX2, VWF, WDR19, WNT10A, WS2B, WS2C, XPA, XPC, XPF, YAP1, ZFYVE26, and ZNF423. Further description of editing or detecting a target nucleic acid in the foregoing genes can be found in Kim et al., “Enhancement of target specificity of CRISPR-Cas12a by using a chimeric DNA-RNA guide”, Nucleic Acids Res. 2020 Sep. 4; 48(15):8601-8616; Wang et al., “Specificity profiling of CRISPR system reveals greatly enhanced off-target gene editing”, Scientific Reports volume 10, Article number: 2269 (2020); Tuladhar et al., “CRISPR-Cas9-based mutagenesis frequently provokes on-target mRNA misregulation”, Nature Communications volume 10, Article number: 4056 (2019); Dong et al., “Genome-Wide Off-Target Analysis in CRISPR-Cas9 Modified Mice and Their Offspring”, G3, Volume 9, Issue 11, 1 Nov. 2019, Pages 3645-3651; Winter et al., “Genome-wide CRISPR screen reveals novel host factors required for Staphylococcus aureus α-hemolysin-mediated toxicity”, Scientific Reports volume 6, Article number: 24242 (2016); and Ma et al., “A CRISPR-Based Screen Identifies Genes Essential for West-Nile-Virus-Induced Cell Death”, Cell Rep. 2015 Jul. 28; 12(4):673-83, which are hereby incorporated by reference in their entirety.

Donor Nucleic Acids

In some embodiments, a donor nucleic acid comprises a nucleic acid that is incorporated into a target nucleic acid or target sequence. In reference to a viral vector, a donor nucleic acid comprises a sequence of nucleotides that will be or has been introduced into a cell following transfection of the viral vector. The donor nucleic acid may be introduced into the cell by any mechanism of the transfecting viral vector, including, but not limited to, integration into the genome of the cell or introduction of an episomal plasmid or viral genome. As another example, when used in reference to the activity of an effector protein, a donor nucleic acid comprises a sequence of nucleotides that will be or has been inserted at the site of cleavage by the effector protein (i.e., nuclease activity: hydrolysis of a phosphodiester bond of a nucleic acid, resulting in a nick or double-strand break). As yet another example, when used in reference to homologous recombination, a donor nucleic acid comprises a sequence of DNA that serves as a template in the process of homologous recombination, which may carry the modification that is to be or has been introduced into the target nucleic acid. By using this donor nucleic acid as a template, the genetic information, including the modification, is copied into the target nucleic acid by way of homologous recombination. In some embodiments, a donor nucleotide comprises a single nucleotide that is incorporated into a target nucleic acid. A nucleotide is typically inserted at a site of cleavage by an effector protein.

Donor nucleic acids of any suitable size may be integrated into a target nucleic acid or genome. In some instances, the donor polynucleotide integrated into a genome is less than 3, about 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5, 12, 12.5, 13, 13.5, 14, 14.5, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 250, 300, 350, 400, 450, or 500 kilobases (kb) in length. In some instances, donor nucleic acids are more than 500 kb in length.

The donor nucleic acid may comprise a sequence that is derived from a plant, bacterium, virus, or animal. The animal may be human. The animal may be a non-human animal, such as, by way of non-limiting example, a mouse, rat, hamster, rabbit, pig, bovine, deer, sheep, goat, chicken, cat, dog, ferret, bird, or non-human primate (e.g., marmoset, rhesus monkey). The non-human animal may be a domesticated mammal or an agricultural mammal.

Genetically Modified Cells and Organisms

Methods of editing described herein may be employed to generate a genetically modified cell. The cell may be a eukaryotic cell (e.g., a mammalian cell) or a prokaryotic cell (e.g., an archaeal cell). The cell may be derived from a multicellular organism and cultured as a unicellular entity. The cell may comprise a heritable genetic modification, such that progeny cells derived therefrom comprise the heritable genetic modification. The cell may be progeny of a genetically modified cell comprising a genetic modification of the genetically modified parent cell. A genetically modified cell may comprise a deletion, insertion, mutation, or non-native sequence relative to a wild-type version of the cell or the organism from which the cell was derived.

Methods may comprise contacting a cell with a nucleic acid (e.g., a plasmid or mRNA) comprising a nucleobase sequence encoding an effector protein, e.g., a D2S effector protein, wherein the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.
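By way of illustration only, a percent identity check of the kind recited above may be sketched as follows. The sketch assumes that the two sequences have already been aligned by a separate tool, with "-" marking gaps, and it counts identical residues over all alignment columns. This is one of several conventions for computing percent identity and is not asserted to be the convention of the disclosure. The function names and example strings are hypothetical.

```python
# Illustrative sketch: percent identity between two pre-aligned sequences.
# The alignment itself is assumed to come from an external alignment tool.


def percent_identity(aligned_a, aligned_b):
    """Return percent identity over the columns of a pairwise alignment.

    Both inputs must have equal length; '-' marks a gap. Columns where both
    sequences have a gap are ignored; a gap opposite a residue counts as a
    mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=70.0):
    """Return True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("MKTAYIAK", "MKTAYLAK"))        # one mismatch in eight columns
print(meets_identity_threshold("MK-A", "MKTA", 80.0))  # a gap counts as a mismatch
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns entirely, which can change whether a borderline sequence meets a given threshold.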

Methods may comprise contacting cells with a nucleic acid (e.g., a plasmid or mRNA) comprising a nucleobase sequence encoding a guide nucleic acid, a tracrRNA, a crRNA, or any combination thereof. In some instances, the nucleobase sequence of the guide nucleic acid is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 149-153. In some instances, the guide nucleic acid comprises a crRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 46-90. In some instances, the guide nucleic acid comprises a tracrRNA sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98%, at least 99%, or 100% identical to any one of SEQ ID NOs: 91-148. Contacting may comprise electroporation, acoustic poration, optoporation, viral vector-based delivery, iTOP, nanoparticle delivery (e.g., lipid or gold nanoparticle delivery), cell-penetrating peptide (CPP) delivery, DNA nanostructure delivery, or any combination thereof.

Methods may comprise contacting a cell with an effector protein, e.g., a D2S effector protein or a multimeric complex thereof, wherein the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 11-45, 202-293, or 728-731. Methods may comprise contacting a cell with a D2S effector protein, wherein the amino acid sequence of the D2S effector protein is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 1-45, 202-293, or 728-731.

Methods may comprise cell line engineering (e.g., engineering a cellfrom a cell line for bioproduction). Cell lines may be used to produce adesired protein. In some instances, target nucleic acids comprise agenomic sequence. In some instances, the cell line is a Chinese hamsterovary cell line (CHO), human embryonic kidney cell line (HEK), celllines derived from cancer cells, cell lines derived from lymphocytes,and the like. Non-limiting examples of cell lines includes: C8161,CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC,HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, CIR, Rath, CV1, RPTE,A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2,P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1,BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRC5, MEF, Hep G2, HeLa B,HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial,BALB/3T3 mouse embryo fibroblast, 3T3 Swiss, 3T3-L1, 132-d5 human fetalfibroblasts; 10.1 mouse fibroblasts, 293-T, 3T3, 721, 9L, A2780,A2780ADR, A2780cis, A172, A20, A253, A431, A-549, ALC, B16, B35, BCP-1cells, BEAS-2B, bEnd.3, BHK-21, BR 293, BxPC3, C3H-10T1/2, C6/36,Cal-27, CHO, CHO-7, CHO-IR, CHO-K1, CHO-K2, CHO-T, CHO Dhfr −/−,COR-L23, COR-L23/CPR, COR-L23/5010, COR-L23/R23, COS-7, COV-434, CML T1,CMT, CT26, D17, DH82, DU145, DuCaP, EL4, EM2, EM3, EMT6/AR1,EMT6/AR10.0, FM3, H1299, H69, HB54, HB55, HCA2, HEK-293, HeLa,Hepalc1c7, HL-60, HMEC, HT-29, Jurkat, JY cells, K562 cells, Ku812,KCL22, KG1, KYO1, LNCap, Ma-Mel 1-48, MC-38, MCF-7, MCF-10A, MDA-MB-231,MDA-MB-468, MDA-MB-435, MDCK II, MDCK II, MOR/0.2R, MONO-MAC 6, MTD-1A,MyEnd, NCI-H69/CPR, NCI-H69/LX10, NCI-H69/LX20, NCI-H69/LX4, NIH-3T3,NALM-1, NW-145, OPCN/OPCT cell lines, Peer, PNT-1A/PNT 2, RenCa, RIN-5F,RMA/RMAS, Saos-2 cells, Sf-9, SkBr3, T2, T-47D, T84, THP1 cell line,U373, U87, U937, VCaP, Vero cells, WM39, WT-49, X63, YAC-1, and YAR.

Non-limiting examples of cells that may be engineered or modified with compositions and methods described herein include immune cells, such as CART, T-cells, B-cells, NK cells, granulocytes, basophils, eosinophils, neutrophils, mast cells, monocytes, macrophages, dendritic cells, antigen-presenting cells (APC), helper T-cells, cytotoxic T-cells, suppressor T-cells, or reticulocytes. In some instances, the cell is a hepatocyte. In some instances, the cell is a cardiomyocyte. In some instances, the cell is a myoblast. In some instances, the cell is a bone cell, a muscle cell, a gamete cell, a fat cell, or a nerve cell. In some instances, the cell is an epithelial cell, a gland cell, a Paneth cell, a Clara cell, an exocrine secretory epithelial cell, a hormone-secreting cell, a pituitary cell, a thyroid gland cell, a parathyroid gland cell, an adrenal gland cell, a kidney cell, a liver cell, a pancreatic cell, an alpha cell, a beta cell, a delta cell, a PP cell, or an epsilon cell. In some instances, the cell is a keratinizing epithelial cell. In some instances, the cell is a neuron, a sensory neuron, a motor neuron, an interneuron, or a brain neuron. In some instances, the cell is a photoreceptor cell. In some instances, the cell is a nurse cell, an interstitial cell, a barrier cell, or an oral cell. In some instances, the cell is an enteroendocrine cell. In some instances, the cell is a Paneth cell or an exocrine secretory epithelial cell. In some instances, the cell is a keratinocyte, a basal cell, a melanocyte, a trichocyte, an intercalated duct cell, a striated duct cell, a duct cell, or an ameloblast. In some cases, the cell is a urinary system cell. In some instances, the cell is an adipocyte, a white fat cell, a brown fat cell, or both. In some instances, the cell is an extracellular matrix cell. In some instances, a cell is a fibroblast, a chondrocyte, an osteoblast, or an osteocyte. In some instances, the cell is a contractile cell, a skeletal muscle cell, a heart muscle cell, or a smooth muscle cell. In some instances, the cell is a sperm cell or an egg cell.

Non-limiting examples of cells that may be engineered or modified with compositions and methods described herein include plant cells, such as parenchyma, sclerenchyma, collenchyma, xylem, phloem, and germline (e.g., pollen) cells, as well as cells from lycophytes, ferns, gymnosperms, angiosperms, bryophytes, charophytes, chlorophytes, rhodophytes, or glaucophytes. Non-limiting examples of cells that may be engineered or modified with compositions and methods described herein include stem cells, such as human stem cells, animal stem cells, stem cells that are not derived from human embryonic stem cells, embryonic stem cells, mesenchymal stem cells, pluripotent stem cells, induced pluripotent stem cells (iPS), somatic stem cells, adult stem cells, hematopoietic stem cells, and tissue-specific stem cells.

Methods of the disclosure may be performed in a subject. Compositions of the disclosure may be administered to a subject. A subject may be a human. A subject may be a mammal (e.g., rat, mouse, cow, dog, pig, sheep, horse). A subject may be a vertebrate or an invertebrate. A subject may be a laboratory animal. A subject may be a patient. A subject may be suffering from a disease. A subject may display symptoms of a disease. A subject may not display symptoms of a disease, but still have a disease. A subject may be under medical care of a caregiver (e.g., the subject is hospitalized and is treated by a physician). Methods of the disclosure may be performed in a plant, a bacterium, or a fungus.

Methods of the disclosure may be performed in a cell. A cell may be in vitro. A cell may be in vivo. A cell may be ex vivo. A cell may be an isolated cell. A cell may be a cell inside of an organism. A cell may be an organism. A cell may be a cell in a cell culture. A cell may be one of a collection of cells. A cell may be a mammalian cell or derived from a mammalian cell. A cell may be a rodent cell or derived from a rodent cell. A cell may be a human cell or derived from a human cell. A cell may be a prokaryotic cell or derived from a prokaryotic cell. A cell may be a bacterial cell or derived from a bacterial cell. A cell may be an archaeal cell or derived from an archaeal cell. A cell may be a eukaryotic cell or derived from a eukaryotic cell. A cell may be a pluripotent stem cell. A cell may be an induced pluripotent stem cell (iPSC). A cell may be a plant cell or derived from a plant cell. A cell may be an animal cell or derived from an animal cell. A cell may be an invertebrate cell or derived from an invertebrate cell. A cell may be a vertebrate cell or derived from a vertebrate cell. A cell may be a microbial cell or derived from a microbial cell. A cell may be a fungal cell or derived from a fungal cell. A cell may be from a specific organ or tissue. A cell may be a T cell. A cell may be a natural killer T cell (NKT). A cell may be a population of cells. In some cases, a cell can be contacted with a DNA donor template.

Methods of the disclosure may be performed in a eukaryotic cell or cellline. In some instances, the eukaryotic cell is a Chinese hamster ovary(CHO) cell. In some instances, the eukaryotic cell is a Human embryonickidney 293 cells (also referred to as HEK or HEK 293) cell. Non-limitingexamples of cell lines that may be used with compositions, systems andmethods of the present disclosure include C8161, CCRF-CEM, MOLT,mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa,MiaPaCell, Panc1, PC-3, TF1, CTLL-2, CIR, Rath, CV1, RPTE, A10, T24,J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1,SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1, BC-3, IC21,DLD2, Raw264.7, NRK, NRK-52E, MRC5, MEF, Hep G2, HeLa B, HeLa T4, COS,COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial, BALB/3T3 mouseembryo fibroblast, 3T3 Swiss, 3T3-L1, 132-d5 human fetal fibroblasts;10.1 mouse fibroblasts, 293-T, 3T3, 721, 9L, A2780, A2780ADR, A2780cis,A172, A20, A253, A431, A-549, ALC, B16, B35, BCP-1 cells, BEAS-2B,bEnd.3, BHK-21, BR 293, BxPC3, C3H-10T1/2, C6/36, Cal-27, CHO, CHO-7,CHO-IR, CHO-K1, CHO-K2, CHO-T, CHO Dhfr −/−, COR-L23, COR-L23/CPR,COR-L23/5010, COR-L23/R23, COS-7, COV-434, CML T1, CMT, CT26, D17, DH82,DU145, DuCaP, EL4, EM2, EM3, EMT6/AR1, EMT6/AR10.0, FM3, H1299, H69,HB54, HB55, HCA2, HEK-293, HeLa, Hepalc1c7, HL-60, HMEC, HT-29, Jurkat,JY cells, K562 cells, Ku812, KCL22, KG1, KYO1, LNCap, Ma-Mel 1-48,MC-38, MCF-7, MCF-10A, MDA-MB-231, MDA-MB-468, MDA-MB-435, MDCK II, MDCKII, MOR/0.2R, MONO-MAC 6, MTD-1A, MyEnd, NCI-H69/CPR, NCI-H69/LX10,NCI-H69/LX20, NCI-H69/LX4, NIH-3T3, NALM-1, NW-145, OPCN/OPCT celllines, Peer, PNT-1A/PNT 2, RenCa, RIN-5F, RMA/RMAS, Saos-2 cells, Sf-9,SkBr3, T2, T-47D, T84, THP1 cell line, U373, U87, U937, VCaP, Verocells, WM39, WT-49, X63, YAC-1, and YAR. 
Non-limiting examples of othercells that may be used with the disclosure include immune cells, such asCART, T-cells, B-cells, NK cells, granulocytes, basophils, eosinophils,neutrophils, mast cells, monocytes, macrophages, dendritic cells,antigen-presenting cells (APC), or adaptive cells. Non-limiting examplesof cells that may be used with this disclosure also include plant cells,such as Parenchyma, sclerenchyma, collenchyma, xylem, phloem, germline(e.g., pollen). Cells from lycophytes, ferns, gymnosperms, angiosperms,bryophytes, charophytes, chloropytes, rhodophytes, or glaucophytes.Non-limiting examples of cells that may be used with this disclosurealso include stem cells, such as human stem cells, animal stem cells,stem cells that are not derived from human embryonic stem cells,embryonic stem cells, mesenchymal stem cells, pluripotent stem cells,induced pluripotent stem cells (iPS), somatic stem cells, adult stemcells, hematopoietic stem cells, tissue-specific stem cells.

Agricultural Engineering

Compositions and methods of the disclosure may be used for agricultural engineering. For example, compositions and methods of the disclosure may be used to confer desired traits on a plant. A plant may be engineered for the desired physiological and agronomic characteristic using the present disclosure. In some instances, the target nucleic acid sequence comprises a nucleic acid sequence of a plant. In some instances, the target nucleic acid sequence comprises a genomic nucleic acid sequence of a plant cell. In some instances, the target nucleic acid sequence comprises a nucleic acid sequence of an organelle of a plant cell. In some instances, the target nucleic acid sequence comprises a nucleic acid sequence of a chloroplast of a plant cell.

The plant may be a dicotyledonous plant. Non-limiting examples of orders of dicotyledonous plants include Magnoliales, Illiciales, Laurales, Piperales, Aristolochiales, Nymphaeales, Ranunculales, Papaverales, Sarraceniaceae, Trochodendrales, Hamamelidales, Eucommiales, Leitneriales, Myricales, Fagales, Casuarinales, Caryophyllales, Batales, Polygonales, Plumbaginales, Dilleniales, Theales, Malvales, Urticales, Lecythidales, Violales, Salicales, Capparales, Ericales, Diapensiales, Ebenales, Primulales, Rosales, Fabales, Podostemales, Haloragales, Myrtales, Cornales, Proteales, Santalales, Rafflesiales, Celastrales, Euphorbiales, Rhamnales, Sapindales, Juglandales, Geraniales, Polygalales, Umbellales, Gentianales, Polemoniales, Lamiales, Plantaginales, Scrophulariales, Campanulales, Rubiales, Dipsacales, and Asterales.

The plant may be a monocotyledonous plant. Non-limiting examples of orders of monocotyledonous plants include Alismatales, Hydrocharitales, Najadales, Triuridales, Commelinales, Eriocaulales, Restionales, Poales, Juncales, Cyperales, Typhales, Bromeliales, Zingiberales, Arecales, Cyclanthales, Pandanales, Arales, Liliales, and Orchidales. A plant may belong to the order, for example, Gymnospermae, Pinales, Ginkgoales, Cycadales, Araucariales, Cupressales, or Gnetales.

Non-limiting examples of plants include plant crops, fruits, vegetables, grains, soy bean, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, pumpkin, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses, wheat, maize, rice, millet, barley, tomato, apple, pear, strawberry, orange, acacia, carrot, potato, sugar beets, yam, lettuce, spinach, sunflower, rape seed, Arabidopsis, alfalfa, amaranth, apple, apricot, artichoke, ash tree, asparagus, avocado, banana, barley, beans, beet, birch, beech, blackberry, blueberry, broccoli, Brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, cedar, a cereal, celery, chestnut, cherry, Chinese cabbage, citrus, clementine, clover, coffee, corn, cotton, cowpea, cucumber, cypress, eggplant, elm, endive, eucalyptus, fennel, figs, fir, geranium, grape, grapefruit, groundnuts, ground cherry, gum hemlock, hickory, kale, kiwifruit, kohlrabi, larch, lettuce, leek, lemon, lime, locust, pine, maidenhair, maize, mango, maple, melon, millet, mushroom, mustard, nuts, oak, oats, oil palm, okra, onion, orange, an ornamental plant or flower or tree, papaya, palm, parsley, parsnip, pea, peach, peanut, pear, peat, pepper, persimmon, pigeon pea, pine, pineapple, plantain, plum, pomegranate, potato, pumpkin, radicchio, radish, rapeseed, raspberry, rice, rye, sorghum, safflower, sallow, soybean, spinach, spruce, squash, strawberry, sugar beet, sugarcane, sunflower, sweet potato, sweet corn, tangerine, tea, tobacco, tomato, trees, triticale, turf grasses, turnips, vine, walnut, watercress, watermelon, wheat, yams, yew, and zucchini. A plant may include algae.

XIV. Methods of Treatment

Described herein are methods for treating a disease in a subject by modifying a target nucleic acid associated with a gene or expression of a gene related to the disease. In some embodiments, methods comprise administering a composition or cell described herein to a subject. By way of non-limiting example, the disease may be a cancer, an ophthalmological disorder, a neurological disorder, a neurodegenerative disease, a blood disorder, or a metabolic disorder, or a combination thereof. The disease may be an inherited disorder, also referred to as a genetic disorder. The disease may be the result of an infection or associated with an infection. In some embodiments, the disease comprises at least one of: a cancer, an inherited disorder, an ophthalmological disorder, a neurological disorder, a blood disorder, a metabolic disorder, a genetic disorder, an infection, or any combination thereof. In some embodiments, the disease or disorder comprises one or more of: achondroplasia, Acromegaly, Alagille Syndrome, Alexander Disease, Alzheimer's disease, amebic dysentery, Amyotrophic lateral sclerosis (ALS), Angelman Syndrome, angioedema, antiphospholipid syndrome, babesiosis, balantidial dysentery, brain or spinal injury, cancer, cardiovascular disease and/or lipodystrophies, centronuclear myopathy, Chagas' disease, Charcot Marie Tooth Disease, CNS trauma, coccidiosis, Cri du chat syndrome, Crouzon syndrome, cystic fibrosis, Dercum disease, diabetes, Dravet Syndrome, Emery-Dreifuss syndrome, encephalitis, epilepsy, Factor V Leiden Thrombophilia, Familial Creutzfeldt-Jakob Disease, Familial Mediterranean Fever, Fanconi anemia, fragile X syndrome, Friedreich's ataxia, Gaucher disease, GM2-Gangliosidoses (e.g., Tay Sachs Disease, Sandhoff disease), hearing loss disorders, hemochromatosis, hemophilia, homozygous familial hypercholesterolemia, Huntington's disease, Joubert syndrome, Leber Congenital Amaurosis Type 10, Li-Fraumeni syndrome, Lynch syndrome, Marfan syndrome, MECP2 Duplication syndrome and Rett syndrome, meningitis, methylmalonic acidemia, migraines, myotonic dystrophy, NAFLD/NASH, neurofibromatosis, non-small cell lung cancer, osteogenesis imperfecta, Parkinson's disease, Peutz-Jeghers syndrome, polycystic kidney disease, retinitis pigmentosa, sickle cell anemia, spinocerebellar ataxia, stroke and other hemorrhages, thalassemia, Usher Syndrome, von Hippel-Lindau disease, von Willebrand disease, Waardenburg syndrome, Zellweger syndrome, or any combination thereof.

The compositions and methods described herein may be used to treat,prevent, or inhibit a disease or syndrome in a subject. In someembodiments, a syndrome is a group of symptoms which, taken together,characterize a condition. In some embodiments, the disease is a liverdisease, a lung disease, an eye disease, or a muscle disease. Exemplarydiseases and syndromes include, but are not limited to: 11-hydroxylasedeficiency; 17,20-desmolase deficiency; 17-hydroxylase deficiency;3-hydroxyisobutyrate aciduria; 3-hydroxysteroid dehydrogenasedeficiency; 46,XY gonadal dysgenesis; AAA syndrome; ABCA3 deficiency;ABCC8-associated hyperinsulinism; aceruloplasminemia; acromegaly;achondrogenesis type 2; acral peeling skin syndrome; acrodermatitisenteropathica; adrenocortical micronodular hyperplasia;adrenoleukodystrophies; adrenomyeloneuropathies; Aicardi-Goutieressyndrome; Alagille disease (also called Alagille Syndrome); AlexanderDisease, Alpers syndrome; alpha-1 antitrypsin deficiency (AATD);alpha-mannosidosis; Alstrom syndrome; Alzheimer's disease; amebicdysentery; amelogenesis imperfecta; amish type microcephaly; amyotrophiclateral sclerosis (ALS); anaplastic large cell lymphoma; anauxeticdysplasia; androgen insensitivity syndrome; angiopathic thrombosis;antiphospholipid syndrome; Antley-Bixler syndrome; APECED, Apertsyndrome, aplasia of lacrimal and salivary glands, argininemia,arrhythmogenic right ventricular dysplasia, Arts syndrome, ARVD2,arylsulfatase deficiency type metachromatic leokodystrophy, ataxiatelangiectasia, autoimmune lymphoproliferative syndrome; autoimmunepolyglandular syndrome type 1; autosomal dominant anhidrotic ectodermaldysplasia; autosomal dominant deafness; autosomal dominant polycystickidney disease; autosomal recessive microtia; autosomal recessive renalglucosuria; autosomal visceral heterotaxy; babesiosis; balantidialdysentery; Bardet-Biedl syndrome; Bartter syndrome; basal cell nevussyndrome; Batten disease; benign recurrent intrahepatic 
cholestasis;beta-mannosidosis; β-thalassemia; Bethlem myopathy; Blackfan-Diamondanemia; bleeding disorder (coagulation); blepharophimosis; Bylerdisease; C syndrome; CADASIL; calcific aortic stenosis; calcification ofjoints and arteries; carbamyl phosphate synthetase deficiency;cardiofaciocutaneous syndrome; Carney triad; carnitinepalmitoyltransferase deficiencies; cartilage-hair hypoplasia; cb1C typeof combined methylmalonic aciduria; CD18 deficiency; CD3Z-associatedprimary T-cell immunodeficiency; CD40L deficiency; CDAGS syndrome;CDG1A; CDG1B; CDG1M; CDG2C; CEDNIK syndrome; central core disease;centronuclear myopathy; cerebral capillary malformation;cerebrooculofacioskeletal syndrome type 4; cerebrooculogacioskeletalsyndrome; cerebrotendinous xanthomatosis; Chaga's Disease; Charcot MarieTooth Disesase; cherubism; CHILD syndrome; chronic granulomatousdisease; chronic recurrent multifocal osteomyelitis; citrin deficiency;classic hemochromatosis; CNPPB syndrome; cobalamin C disease; Cockaynesyndrome; coenzyme Q10 deficiency; Coffin-Lowry syndrome; Cohensyndrome; combined deficiency of coagulation factors V; common variableimmune deficiency 3; complement hyperactivation; complete androgeninsentivity; cone rod dystrophies; conformational diseases; congenitalbile adid synthesis defect type 1; congenital bile adid synthesis defecttype 2; congenital defect in bile acid synthesis type; congenitalerythropoietic porphyria; congenital generalized osteosclerosis;Cornelia de Lange syndrome; coronary heart disease; Cousin syndrome;Cowden disease; COX deficiency; Cri du chat syndrome; Crigler-Najjardisease; Crigler-Najjar syndrome type 1; Crisponi syndrome; Crouzonsyndrome; Currarino syndrome; Curth-Macklin type ichthyosis hystrix;cutis laxa; cystic fibrosis; cystinosis; d-2-hydroxyglutaric aciduria;DDP syndrome; Dejerine-Sottas disease; Denys-Drash syndrome; Dercumdisease; desmin cardiomyopathy; desmin myopathy; DGUOK-associatedmitochondrial DNA depletion; diabetes Type I; 
diabetes Type II;disorders of glutamate metabolism; distal spinal muscular atrophy type5; DNA repair diseases; dominant optic atrophy; Doyne honeycomb retinaldystrophy; Dravet Syndrome; Duchenne muscular dystrophy; dyskeratosiscongenita; Ehlers-Danlos syndrome type 4; Ehlers-Danlos syndromes;Elejalde disease; Ellis-van Creveld disease; Emery-Dreifuss musculardystrophies; encephalomyopathic mtDNA depletion syndrome; encephalitis;enzymatic diseases; EPCAM-associated congenital tufting enteropathy;epidermolysis bullosa with pyloric atresia; epilepsy;facioscapulohumeral muscular dystrophy; Factor V Leiden thrombophilia;Faisalabad histiocytosis; familial atypical mycobacteriosis; familialcapillary malformation-arteriovenous; Familial Creutzfeld-Jakob disease;familial esophageal achalasia; familial glomuvenous malformation;familial hemophagocytic lymphohistiocytosis; familial mediterraneanfever; familial megacalyces; familial schwannomatosis; familial spinabifida; familial splenic asplenia/hypoplasia; familial thromboticthrombocytopenic purpura; Fanconi disease (Fanconi anemia); Feingoldsyndrome; FENIB; fibrodysplasia ossificans progressiva; FKTN; Fragile Xsyndrome; Francois-Neetens fleck corneal dystrophy; Frasier syndrome;Friedreich's ataxia; FTDP-17; Fuchs corneal dystrophy; fucosidosis; G6PDdeficiency; galactosialidosis; Galloway syndrome; Gardner syndrome;Gaucher disease; Gitelman syndrome; GLUT1 deficiency; GM2-Gangliosidoses(e.g., Tay Sachs Disease, Sandhoff Disease) glycogen storage diseasetype 1b; glycogen storage disease type 2; glycogen storage disease type3; glycogen storage disease type 4; glycogen storage disease type 9a;glycogen storage diseases; GM1-gangliosidosis; Greenberg syndrome; Greigcephalopolysyndactyly syndrome; hair genetic diseases; hairy cellleukemia; HANAC syndrome; harlequin type ichtyosis congenita; HDRsyndrome; hearing loss; hemochromatosis type 3; hemochromatosis type 4;hemolytic anemia; hemolytic uremic syndrome; hemophilia A; hemophilia 
B; hereditary angioedema type 3; hereditary angioedemas; hereditary hemorrhagic telangiectasia; hereditary hypofibrinogenemia; hereditary intraosseous vascular malformation; hereditary leiomyomatosis and renal cell cancer; hereditary neuralgic amyotrophy; hereditary sensory and autonomic neuropathy type; Hermansky-Pudlak disease; HHH syndrome; HHT2; hidrotic ectodermal dysplasia type 1; hidrotic ectodermal dysplasias; histiocytic sarcoma; HNF4A-associated hyperinsulinism; HNPCC; homozygous familial hypercholesterolemia; human immunodeficiency with microcephaly; human papilloma virus (HPV) infection; Huntington's disease; hyper-IgD syndrome; hyperinsulinism-hyperammonemia syndrome; hypercholesterolemia; hypertrophy of the retinal pigment epithelium; hypochondrogenesis; hypohidrotic ectodermal dysplasia; ICF syndrome; idiopathic congenital intestinal pseudo-obstruction; immunodeficiency 13; immunodeficiency 17; immunodeficiency 25; immunodeficiency with hyper-IgM type 1; immunodeficiency with hyper-IgM type 3; immunodeficiency with hyper-IgM type 4; immunodeficiency with hyper-IgM type 5; immunoglobulin alpha deficiency; inborn errors of thyroid metabolism; infantile myofibromatosis; infantile visceral myopathy; infantile X-linked spinal muscular atrophy; intrahepatic cholestasis of pregnancy; IPEX syndrome; IRAK4 deficiency; isolated congenital asplenia; Jeune syndrome; Johanson-Blizzard syndrome; Joubert syndrome; JP-HHT syndrome; juvenile hemochromatosis; juvenile hyaline fibromatosis; juvenile nephronophthisis; Kabuki mask syndrome; Kallmann syndromes; Kartagener syndrome; KCNJ11-associated hyperinsulinism; Kearns-Sayre syndrome; Kostmann disease; Kozlowski type of spondylometaphyseal dysplasia; Krabbe disease; LADD syndrome; late infantile-onset neuronal ceroid lipofuscinosis; LCK deficiency; LDHCP syndrome; Leber Congenital Amaurosis Type 10; Legius syndrome; Leigh syndrome; lethal congenital contracture syndrome 2; lethal congenital contracture syndromes; lethal contractural syndrome type
3; lethal neonatal CPT deficiency type 2; lethal osteosclerotic bone dysplasia; leukocyte adhesion deficiency; Li-Fraumeni syndrome; LIG4 syndrome; lipodystrophy; lissencephaly type 1; lissencephaly type 3; Loeys-Dietz syndrome; low phospholipid-associated cholelithiasis; Lynch Syndrome; lysinuric protein intolerance; a lysosomal storage disease (e.g., Hunter syndrome, Hurler syndrome); macular dystrophy; Maffucci syndrome; Majeed syndrome; mannose-binding protein deficiency; mantle cell lymphoma; Marfan disease; Marshall syndrome; MASA syndrome; mastocytosis; MCAD deficiency; McCune-Albright syndrome; MCKD2; Meckel syndrome; MECP2 Duplication Syndrome; Meesmann corneal dystrophy; megacystis-microcolon-intestinal hypoperistalsis; megaloblastic anemia type 1; MEHMO; MELAS; Melnick-Needles syndrome; MEN2s; meningitis; Menkes disease; metachromatic leukodystrophies; methylmalonic acidemia due to transcobalamin receptor defect; methylmalonic acidurias; methylvalonic aciduria; microcoria-congenital nephrosis syndrome; microvillous atrophy; migraine; mitochondrial neurogastrointestinal encephalomyopathy; monilethrix; monosomy X; mosaic trisomy 9 syndrome; Mowat-Wilson syndrome; mucolipidosis type 2; mucolipidosis type IIIa; mucolipidosis type IV; mucopolysaccharidoses; mucopolysaccharidosis type 3A; mucopolysaccharidosis type 3C; mucopolysaccharidosis type 4B; multiminicore disease; multiple acyl-CoA dehydrogenation deficiency; multiple cutaneous and mucosal venous malformations; multiple endocrine neoplasia type 1; multiple sulfatase deficiency; mycosis fungoides; myotonic dystrophy; NAIC; nail-patella syndrome; nemaline myopathies; neonatal diabetes mellitus; neonatal surfactant deficiency; nephronophthisis; Netherton disease; neurofibromatoses; neurofibromatosis type 1; Niemann-Pick disease type A; Niemann-Pick disease type B; Niemann-Pick disease type C; NKX2E; non-alcoholic fatty liver disease (NAFLD); non-alcoholic steatohepatitis (NASH); Noonan syndrome; North American Indian childhood
cirrhosis; NR0B1 duplication-associated DSD; ocular genetic diseases; oculo-auricular syndrome; OLEDAID; oligomeganephronia; oligomeganephronic renal hypoplasia; Ollier disease; Opitz-Kaveggia syndrome; orofaciodigital syndrome type 1; orofaciodigital syndrome type 2; osseous Paget disease; osteogenesis imperfecta; otopalatodigital syndrome type 2; OXPHOS diseases; palmoplantar hyperkeratosis; panlobar nephroblastomatosis; Parkes-Weber syndrome; Parkinson's disease; partial deletion of 21q22.2-q22.3; Pearson syndrome; Pelizaeus-Merzbacher disease; Pendred syndrome; pentalogy of Cantrell; peroxisomal acyl-CoA-oxidase deficiency; Peutz-Jeghers syndrome; Pfeiffer syndrome; Pierson syndrome; pigmented nodular adrenocortical disease; pipecolic acidemia; Pitt-Hopkins syndrome; plasmalogens deficiency; platelet glycoprotein IV deficiency; pleuropulmonary blastoma and cystic nephroma; polycystic kidney disease; polycystic ovarian disease; polycystic lipomembranous osteodysplasia; Pompe disease, including infantile onset Pompe disease (IOPD) and late onset Pompe disease (LOPD); porphyrias; PRKAG2 cardiac syndrome; premature ovarian failure; primary erythermalgia; primary hemochromatoses; primary hyperoxaluria; progressive familial intrahepatic cholestasis; propionic acidemia; protein-losing enteropathy; pyruvate decarboxylase deficiency; RAPADILINO syndrome; renal cystinosis; retinitis pigmentosa; Rett Syndrome; rhabdoid tumor predisposition syndrome; Rieger syndrome; ring chromosome 4; Roberts syndrome; Robinow-Sorauf syndrome; Rothmund-Thomson syndrome; severe combined immunodeficiency disorder (SCID); Saethre-Chotzen syndrome; Sandhoff disease; SC phocomelia syndrome; SCAS; Schinzel phocomelia syndrome; short rib-polydactyly syndrome type 1; short rib-polydactyly syndrome type 4; short-rib polydactyly syndrome type 2; short-rib polydactyly syndrome type 3; Shwachman disease; Shwachman-Diamond disease; sickle cell anemia; Silver-Russell syndrome; Simpson-Golabi-Behmel syndrome; Smith-Lemli-Opitz
syndrome; SPG7-associated hereditary spastic paraplegia; spherocytosis; spinocerebellar ataxia; split-hand/foot malformation with long bone deficiencies; spondylocostal dysostosis; sporadic visceral myopathy with inclusion bodies; storage diseases; Stargardt macular dystrophy; STRA6-associated syndrome; stroke; Tay-Sachs disease; thanatophoric dysplasia; thyroid metabolism diseases; Tourette syndrome; transthyretin-associated amyloidosis; trisomy 13; trisomy 22; trisomy 2p syndrome; tuberous sclerosis; tufting enteropathy; urea cycle diseases; Usher Syndrome; Van Den Ende-Gupta syndrome; Van der Woude syndrome; variegated mosaic aneuploidy syndrome; VLCAD deficiency; von Hippel-Lindau disease; von Willebrand disease; Waardenburg syndrome; WAGR syndrome; Walker-Warburg syndrome; Werner syndrome; Wilson disease; Wiskott-Aldrich Syndrome; Wolcott-Rallison syndrome; Wolfram syndrome; X-linked agammaglobulinemia; X-linked chronic idiopathic intestinal pseudo-obstruction; X-linked cleft palate with ankyloglossia; X-linked dominant chondrodysplasia punctata; X-linked ectodermal dysplasia; X-linked Emery-Dreifuss muscular dystrophy; X-linked lissencephaly; X-linked lymphoproliferative disease; X-linked visceral heterotaxy; xanthinuria type 1; xanthinuria type 2; xeroderma pigmentosum; XPV; and Zellweger disease.

Described herein are compositions and methods for editing or detecting a target nucleic acid, wherein the target nucleic acid is a gene, a portion thereof, or a transcript thereof. In some embodiments, the target nucleic acid is a reverse transcript (e.g., a cDNA) of an mRNA transcribed from the gene, or an amplicon thereof. In some embodiments, the target nucleic acid is an amplicon of at least a portion of a gene. Non-limiting examples of genes are: AAVS1, ABCA4, ABCB11, ABCC8, ABCD1, ABCG5, ABCG8, ACAD9, ACADM, ACADVL, ACAT1, ACOX1, ACSF3, ADA, ADAMTS2, ADGRG1, AGA, AGL, AGPS, AGXT, AHI1, AIRE, ALDH3A2, ALDOB, ALG6, ALK, ALKBH5, ALMS1, ALPL, AMRC9, AMT, ANAPC10, ANAPC11, ANGPTL3, APC, Apo(a), APOCIII, APOEε4, APOL1, APP, AQP2, AR, ARFRP1, ARG1, ARH, ARL13B, ARL6, ARSA, ARSB, ASL, ASNS, ASPA, ASS1, ATM, ATP6V1B1, ATP7A, ATP7B, ATRX, ATXN1, ATXN10, ATXN2, ATXN3, ATXN7, ATXN8OS, AXIN1, AXIN2, B2M, BACE-1, BAK1, BAP1, BARD1, BAX2, BBS1, BBS10, BBS12, BBS2, BCKDHA, BCKDHB, BCL2L2, BCS1L, BEST1, Beta globin gene, BIM, BMPR1A, BRAF, BRAFV600E, BRCA1, BRCA2, BRIP1, BSND, C9orf72, CA4, CACNA1A, CAPN3, CASR, CBS, CCNB1, CC2D2A, CCR5, CD1, CD2, CD3, CD3D, CD3Z, CD4, CD5, CD6, CD7, CD8A, CD8B, CD9, CD14, CD18, CD19, CD21, CD22, CD23, CD27, CD28, CD30, CD33, CD34, CD36, CD38, CD40, CD40L, CD44, CD46, CD47, CD48, CD52, CD55, CD57, CD58, CD59, CD68, CD69, CD72, CD73, CD74, CD79A, CD80, CD81, CD83, CD84, CD86, CD90, CD93, CD96, CD99, CD100, CD123, CD160, CD163, CD164, CD164L2, CD166, CD200, CD204, CD207, CD209, CD226, CD244, CD247, CD274, CD276, CD300, CD320, CDC73, CDH1, CDH23, CDK11, CDK4, CDKN1A, CDKN1B, CDKN1C, CDKN2A, CDKN2B, CEBPA, CELA3B, CEP290, CERKL, CFB, CFTR, CHCHD10, CHEK2, CHM, CHRNE, CIITA, CLN3, CLN5, CLN6, CLN8, CLRN1, CLTA, CNBP, CNGB1, CNGB3, COL1A1, COL1A2, COL27A1, COL4A3, COL4A4, COL4A5, COL7A1, CPS1, CPT1A, CPT2, CRB1, CREBBP, CRX, CRYAA, CTNNA1, CTNNB1, CTNND2, CTNS, CTSK, CXCL12, CYBA, CYBB, CYP11B1, CYP11B2, CYP17A1, CYP19A1, CYP27A1, DBT, DCC, DCLRE1C, DERL2, DFNA36, DFNB31,
DGAT2, DHCR7, DHDDS, DICER1, DIS3L2, DLD, DMD, DMPK, DNAH5, DNAI1, DNAI2, DNM2, DNMT1, DPC4, DYSF, EDA, EDN3, EDNRB, EGFR, EIF2B5, EMC2, EMC3, EMD, EMX1, EN1, EPCAM, ERCC6, ERCC8, ESCO2, ETFA, ETFDH, ETHE1, EVC, EVC2, EYS, F5, F9, FXI, FAH, FAM161A, FANCA, FANCB, FANCC, FANCD1, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCJ, FANCL, FANCM, FANCN, FANCP, FANCS, FBN1, FGF14, FGFR2, FGFR3, FGA, FGB, FGG, FH, FHL1, FIX, FKRP, FKTN, FLCN, FMR1, FOXP3, FSCN2, FUS, FUT8, FVIII, FXII, FXN, G6PC, GAA, GALC, GALK1, GALT, GAMT, GATA2, GATA-4, GBA, GBE1, GCDH, GCGR, GDNF, GFAP, GFM1, GHR, GJB1, GJB2, GLA, GLB1, GLDC, GLE1, GNE, GNPTAB, GNPTG, GNS, GPC3, GPR98, GREM1, GRHPR, GRIN2B, H2AFX, H2AX, HADHA, HAX1, HBA1, HBA2, HBB, HER2, HEXA, HEXB, HFE, HGSNAT, HLCS, HMGCL, HOGA1, HOXB13, HPRPF3, HPRT1, HPS1, HPS3, HRAS, HRD1, HSD17B4, HSD3B2, HTT, HUS1, HYAL1, HYLS1, IDS, IDUA, IFITM5, IKBKAP, IL2RG, IL7R, INPP5E, IRF4, ITGB2, ITPR1, IVD, JAG1, JAK1, JAK3, KCNC3, KCND3, KCNJ11, KLHL7, KRAS, LAMA2, LAMA3, LAMB3, LAMC2, LCA5, LDLR, LDLRAP1, LHX3, LIFR, LIPA, LMNA, LOR, LOXHD1, LPL, LRAT, LRP6, LRPPRC, LRRK2, MADR2, MAN2B1, MAPT, MAX, MCM6, MCOLN1, MECP2, MED17, MEFV, MEN1, MERTK, MESP2, MET, METex14, MFN2, MFSD8, MIA3, MITF, MKL2, MKS1, MLC1, MLH1, MLH3, MMAA, MMAB, MMACHC, MMADHC, MMD, MPI, MPL, MPV17, MSH2, MSH3, MSH6, MTHFD1L, MTHFR, MTM1, MTRR, MTTP, MUT, MUTYH, MYC, MYH7, MYO7A, NAGLU, NAGS, NBN, NDRG1, NDUFAF5, NDUFS6, NEB, NF1, NF2, NKX2-5, NOG, NOTCH1, NOTCH2, NPC1, NPC2, NPHP1, NPHS1, NPHS2, NRAS, NR2E3, NTHL1, NTRK, NTRK1, OAT, OCT4, OFD1, OPA3, OTC, PAH, PALB2, PAQR8, PAX3, PC, PCCA, PCCB, PCDH15, PCSK9, PD1, PDCD1, PDE6B, PDGFRA, PDHA1, PDHB, PEX1, PEX10, PEX12, PEX13, PEX14, PEX16, PEX19, PEX2, PEX26, PEX3, PEX5, PEX6, PEX7, PFKM, PHGDH, PHOX2B, PKD1, PKD2, PKHD1, PKK, PLEKHG4, PMM2, PMP22, PMS1, PMS2, PNPLA3, POLD1, POLE, POMGNT1, POT1, POU5F1, PPM1A, PPP2R2B, PPT1, PRCD, PRKAG2, PRKAR1A, PRKCG, PRNP, PROM1, PROP1, PRPF31, PRPF8, PRPH2, PRPS1, PSAP, PSD95, PSEN1, PSEN2, PSRC1, PTCH1, PTEN, PTS, PUS1, PYGM,
RAB23, RAD50, RAD51C, RAD51D, RAG1, RAG2, RAPSN, RARS2, RB1, RDH12, RECQL4, RET, RHO, RICTOR, RMRP, ROS1, RP1, RP2, RPE65, RPGR, RPGRIP1L, RPL32P3, RS1, RTCA, RTEL1, RUNX1, SACS, SAMHD1, SCN1A, SCN2A, SDHA, SDHAF2, SDHB, SDHC, SDHD, SEL1L, SEPSECS, SERPINA1, SERPING1, SGCA, SGCB, SGCG, SGSH, SIRT1, SLC12A3, SLC12A6, SLC17A5, SLC22A5, SLC25A13, SLC25A15, SLC26A2, SLC26A4, SLC35A3, SLC35B4, SLC37A4, SLC39A4, SLC4A11, SLC6A8, SLC7A7, SMAD3, SMAD4, SMARCA4, SMARCAL1, SMARCB1, SMARCE1, SMN1, SMPD1, SNAI2, SNCA, SNRNP200, SOD1, SOX10, SPARA7, SPTBN2, STAR, STAT3, STK11, SUFU, SUMF1, SYNE1, SYNE2, SYS1, TARDBP, TAT, TBK1, TBP, TCIRG1, TCTN3, TECPR2, TERC, TERT, TFR2, TGFBR2, TGM1, TH, TLE3, TMEM127, TMEM138, TMEM216, TMEM43, TMEM67, TMPRSS6, TOP1, TOPORS, TP53, TPP1, TRAC, TRMU, TSC1, TSC2, TSFM, TSPAN14, TTBK2, TTC8, TTPA, TTR, TULP1, TYMP, UBE2G2, UBE2J1, UBE3A, USH1C, USH1G, USH2A, VEGF, VHL, VPS13A, VPS13B, VPS35, VPS45, VRK1, VSX2, VWF, WAS, WDR19, WDR48, WNT10A, WRN, WS2B, WS2C, WT1, XPA, XPC, XPF, XRCC3, YAP1, ZAC1, ZEB1, ZFYVE26, and ZNF423.

In some embodiments, the method for treating a disease comprises modifying at least one gene associated with the disease or modifying expression of the at least one gene such that the disease is treated. In some embodiments, the disease is Alzheimer's disease and the gene is selected from APP, BACE-1, PSD95, MAPT, PSEN1, PSEN2, and APOEε4. In some embodiments, the disease is Parkinson's disease and the gene is selected from SNCA, GDNF, and LRRK2. In some embodiments, the disease comprises Centronuclear myopathy and the gene is DNM2. In some embodiments, the disease is Huntington's disease and the gene is HTT. In some embodiments, the disease is Alpha-1 antitrypsin deficiency (AATD) and the gene is SERPINA1. In some embodiments, the disease is amyotrophic lateral sclerosis (ALS) and the gene is selected from SOD1, FUS, C9ORF72, ATXN2, TARDBP, and CHCHD10. In some embodiments, the disease comprises Alexander Disease and the gene is GFAP. In some embodiments, the disease comprises anaplastic large cell lymphoma and the gene is CD30. In some embodiments, the disease comprises Angelman Syndrome and the gene is UBE3A. In some embodiments, the disease comprises calcific aortic stenosis and the gene is Apo(a). In some embodiments, the disease comprises CD3Z-associated primary T-cell immunodeficiency and the gene is CD3Z or CD247. In some embodiments, the disease comprises CD18 deficiency and the gene is ITGB2. In some embodiments, the disease comprises CD40L deficiency and the gene is CD40L. In some embodiments, the disease comprises CNS trauma and the gene is VEGF. In some embodiments, the disease comprises coronary heart disease and the gene is selected from FGA, FGB, and FGG. In some embodiments, the disease comprises MECP2 Duplication syndrome or Rett syndrome and the gene is MECP2. In some embodiments, the disease comprises a bleeding disorder (coagulation) and the gene is FXI. In some embodiments, the disease comprises fragile X syndrome and the gene is FMR1.
In some embodiments, the disease comprises Fuchs corneal dystrophy and the gene is selected from ZEB1, SLC4A11, and LOXHD1. In some embodiments, the disease comprises GM2-Gangliosidoses (e.g., Tay Sachs Disease, Sandhoff disease) and the gene is selected from HEXA and HEXB. In some embodiments, the disease comprises Hearing loss disorders and the gene is DFNA36. In some embodiments, the disease is Pompe disease, including infantile onset Pompe disease (IOPD) and late onset Pompe disease (LOPD), and the gene is GAA. In some embodiments, the disease is Retinitis pigmentosa and the gene is selected from PDE6B, RHO, RP1, RP2, RPGR, PRPH2, IMPDH1, PRPF31, CRB1, PRPF8, TULP1, CA4, HPRPF3, ABCA4, EYS, CERKL, FSCN2, TOPORS, SNRNP200, PRCD, NR2E3, MERTK, USH2A, PROM1, KLHL7, CNGB1, TTC8, ARL6, DHDDS, BEST1, LRAT, SPARA7, CRX, CLRN1, RPE65, and WDR19. In some embodiments, the disease comprises Leber Congenital Amaurosis Type 10 and the gene is CEP290. In some embodiments, the disease is cardiovascular disease and/or lipodystrophies and the gene is selected from ABCG5, ABCG8, AGT, ANGPTL3, APOCIII, APOA1, APOL1, ARH, CDKN2B, CFB, CXCL12, FXI, FXII, GATA-4, MIA3, MKL2, MTHFD1L, MYH7, NKX2-5, NOTCH1, PKK, PCSK9, PSRC1, SMAD3, and TTR. In some embodiments, the disease comprises acromegaly and the gene is GHR. In some embodiments, the disease comprises acute myeloid leukemia and the gene is CD22. In some embodiments, the disease is diabetes and the gene is GCGR. In some embodiments, the disease is NAFLD/NASH and the gene is selected from DGAT2 and PNPLA3. In some embodiments, the disease is cancer and the gene is selected from STAT3, YAP1, FOXP3, AR (Prostate cancer), and IRF4 (multiple myeloma). In some embodiments, the disease is cystic fibrosis and the gene is CFTR. In some embodiments, the disease is Duchenne muscular dystrophy and the gene is DMD. In some embodiments, the disease comprises angioedema and the gene is PKK. In some embodiments, the disease comprises thalassemia and the gene is TMPRSS6.
In some embodiments, the disease comprises achondroplasia and the gene is FGFR3. In some embodiments, the disease comprises Cri du chat syndrome and the gene is CTNND2. In some embodiments, the disease comprises sickle cell anemia and the gene is the Beta globin gene. In some embodiments, the disease comprises Alagille Syndrome and the gene is selected from JAG1 and NOTCH2. In some embodiments, the disease comprises Charcot Marie Tooth disease and the gene is selected from PMP22 and MFN2. In some embodiments, the disease comprises Crouzon syndrome and the gene is selected from FGFR2 and FGFR3. In some embodiments, the disease comprises Dravet Syndrome and the gene is selected from SCN1A and SCN2A. In some embodiments, the disease comprises Emery-Dreifuss syndrome and the gene is selected from EMD, LMNA, SYNE1, SYNE2, FHL1, and TMEM43. In some embodiments, the disease comprises Factor V Leiden thrombophilia and the gene is F5. In some embodiments, the disease comprises Fanconi anemia and the gene is selected from FANCA, FANCB, FANCC, FANCD1, FANCD2, FANCE, FANCF, FANCG, FANCI, FANCJ, FANCL, FANCM, FANCN, FANCP, FANCS, RAD51C, and XPF. In some embodiments, the disease comprises Familial Creutzfeldt-Jakob disease and the gene is PRNP. In some embodiments, the disease comprises Familial Mediterranean Fever and the gene is MEFV. In some embodiments, the disease comprises Friedreich's ataxia and the gene is FXN. In some embodiments, the disease comprises Gaucher disease and the gene is GBA. In some embodiments, the disease comprises human papilloma virus (HPV) infection and the gene is HPV E7. In some embodiments, the disease comprises hemochromatosis and the gene is HFE, optionally comprising a C282Y mutation. In some embodiments, the disease comprises Hemophilia A and the gene is FVIII. In some embodiments, the disease comprises histiocytosis and the gene is CD1. In some embodiments, the disease comprises immunodeficiency 17 and the gene is CD3D.
In some embodiments, the disease comprises immunodeficiency 13 and the gene is CD4. In some embodiments, the disease comprises Common Variable Immunodeficiency and the gene is selected from CD19 and CD81. In some embodiments, the disease comprises Joubert syndrome and the gene is selected from INPP5E, TMEM216, AHI1, NPHP1, CEP290, TMEM67, RPGRIP1L, ARL13B, CC2D2A, OFD1, TMEM138, TCTN3, ZNF423, and AMRC9. In some embodiments, the disease comprises leukocyte adhesion deficiency and the gene is CD18. In some embodiments, the disease comprises Li-Fraumeni syndrome and the gene is TP53. In some embodiments, the disease comprises lymphoproliferative syndrome and the gene is CD27. In some embodiments, the disease comprises Lynch syndrome and the gene is selected from MSH2, MLH1, MSH6, PMS2, PMS1, TGFBR2, and MLH3. In some embodiments, the disease comprises mantle cell lymphoma and the gene is CD5. In some embodiments, the disease comprises Marfan syndrome and the gene is FBN1. In some embodiments, the disease comprises mastocytosis and the gene is CD2. In some embodiments, the disease comprises methylmalonic acidemia and the gene is selected from MMAA, MMAB, and MUT. In some embodiments, the disease is mycosis fungoides and the gene is CD7. In some embodiments, the disease is myotonic dystrophy and the gene is selected from CNBP and DMPK. In some embodiments, the disease comprises neurofibromatosis and the gene is selected from NF1 and NF2. In some embodiments, the disease comprises osteogenesis imperfecta and the gene is selected from COL1A1, COL1A2, and IFITM5. In some embodiments, the disease is non-small cell lung cancer and the gene is selected from KRAS, EGFR, ALK, METex14, BRAF V600E, ROS1, RET, and NTRK. In some embodiments, the disease comprises Peutz-Jeghers syndrome and the gene is STK11. In some embodiments, the disease comprises polycystic kidney disease and the gene is selected from PKD1 and PKD2.
In some embodiments, the disease comprises Severe Combined Immune Deficiency and the gene is selected from IL7R, RAG1, and JAK3. In some embodiments, the disease comprises PRKAG2 cardiac syndrome and the gene is PRKAG2. In some embodiments, the disease comprises spinocerebellar ataxia and the gene is selected from ATXN1, ATXN2, ATXN3, PLEKHG4, SPTBN2, CACNA1A, ATXN7, ATXN8OS, ATXN10, TTBK2, PPP2R2B, KCNC3, PRKCG, ITPR1, TBP, KCND3, and FGF14. In some embodiments, the disease comprises Usher Syndrome and the gene is selected from MYO7A, USH1C, CDH23, PCDH15, USH1G, USH2A, GPR98, DFNB31, and CLRN1. In some embodiments, the disease comprises von Willebrand disease and the gene is VWF. In some embodiments, the disease comprises Waardenburg syndrome and the gene is selected from PAX3, MITF, WS2B, WS2C, SNAI2, EDNRB, EDN3, and SOX10. In some embodiments, the disease comprises Wiskott-Aldrich Syndrome and the gene is WAS. In some embodiments, the disease comprises von Hippel-Lindau disease and the gene is VHL. In some embodiments, the disease comprises Wilson disease and the gene is ATP7B. In some embodiments, the disease comprises Zellweger syndrome and the gene is selected from PEX1, PEX2, PEX3, PEX5, PEX6, PEX10, PEX12, PEX13, PEX14, PEX16, PEX19, and PEX26. In some embodiments, the disease comprises infantile myofibromatosis and the gene is CD34. In some embodiments, the disease comprises platelet glycoprotein IV deficiency and the gene is CD36. In some embodiments, the disease comprises immunodeficiency with hyper-IgM type 3 and the gene is CD40. In some embodiments, the disease comprises hemolytic uremic syndrome and the gene is CD46. In some embodiments, the disease comprises complement hyperactivation, angiopathic thrombosis, or protein-losing enteropathy and the gene is CD55. In some embodiments, the disease comprises hemolytic anemia and the gene is CD59. In some embodiments, the disease comprises calcification of joints and arteries and the gene is CD73.
In some embodiments, the disease comprises immunoglobulin alpha deficiency and the gene is CD79A. In some embodiments, the disease comprises C syndrome and the gene is CD96. In some embodiments, the disease comprises hairy cell leukemia and the gene is CD123. In some embodiments, the disease comprises histiocytic sarcoma and the gene is CD163. In some embodiments, the disease comprises autosomal dominant deafness and the gene is CD164. In some embodiments, the disease comprises immunodeficiency 25 and the gene is CD247. In some embodiments, the disease comprises methylmalonic acidemia due to transcobalamin receptor defect and the gene is CD320.

Cancer

In some embodiments, the disease is cancer. In some embodiments, the cancer is a solid cancer (i.e., a tumor). In some embodiments, the cancer is selected from a blood cell cancer, a leukemia, and a lymphoma. The cancer can be a leukemia, such as, by way of non-limiting example, acute myeloid (or myelogenous) leukemia (AML), chronic myeloid (or myelogenous) leukemia (CML), acute lymphocytic (or lymphoblastic) leukemia (ALL), and chronic lymphocytic leukemia (CLL). In some embodiments, the cancer is any one of colon cancer, rectal cancer, renal-cell carcinoma, liver cancer, bladder cancer, cancer of the kidney or ureter, lung cancer, non-small cell lung cancer, cancer of the small intestine, esophageal cancer, melanoma, bone cancer, pancreatic cancer, skin cancer, brain cancer (e.g., glioblastoma), cancer of the head or neck, uterine cancer, ovarian cancer, breast cancer, testicular cancer, cervical cancer, stomach cancer, Hodgkin's Disease, non-Hodgkin's lymphoma, and thyroid cancer.

In some embodiments, mutations are associated with cancer or are causative of cancer. The target nucleic acid, in some embodiments, comprises a portion of a gene comprising a mutation associated with cancer, a gene whose overexpression is associated with cancer, a tumor suppressor gene, an oncogene, a checkpoint inhibitor gene, a gene associated with cellular growth, a gene associated with cellular metabolism, a gene associated with cell cycle, or a combination thereof. Non-limiting examples of genes comprising a mutation associated with cancer are ABL, ACE, AF4/HRX, AKT-2, ALK, ALK/NPM, AML1, AML1/MTG8, APC, ATM, AXIN2, AXL, BAP1, BARD1, BCL-2, BCL-3, BCL-6, BCR/ABL, BLM, BMPR1A, BRCA1, BRCA2, BRIP1, c-MYC, CASR, CCR5, CDC73, CDH1, CDK4, CDKN1B, CDKN1C, CDKN2A, CEBPA, CHEK2, CREBBP, CTNNA1, DBL, DEK/CAN, DICER1, DIS3L2, E2A/PBX1, EGFR, ENL/HRX, EPCAM, ERG/TLS, ERBB, ERBB-2, ETS-1, EWS/FLI-1, FH, FKRP, FLCN, FMS, FOS, FPS, GATA2, GCG, GLI, GPC3, GPGSP, GREM1, HER2/neu, HOX11, HOXB13, HRAS, HST, IL-3, INT-2, JAK1, JUN, KIT, KS3, K-SAM, LBC, LCK, L-MYC, LYL-1, LYT-10, LYT-10/Ca1, MAS, MAX, MDM-2, MEN1, MET, MITF, MLH1, MLL, MOS, MSH1, MSH2, MSH3, MSH6, MTG8/AML1, MUTYH, MYB, MYH11/CBFB, NBN, NEU, NF1, NF2, N-MYC, NTHL1, OST, PALB2, PAX-5, PBX1/E2A, PCDC1, PDGFRA, PHOX2B, PMS2, POLD1, POLE, POT1, PPARG, PRAD-1, PRKAR1A, PTCH1, PTEN, RAD50, RAD51C, RAD51D, RAF, RAR/PML, RAS-H, RAS-K, RAS-N, RB1, RECQL4, REL/NRG, RET, RHOM1, RHOM2, ROS, RUNX1, SDHA, SDHAF, SDHAF2, SDHB, SDHC, SDHD, SET/CAN, SIS, SKI, SMAD4, SMARCA4, SMARCB1, SMARCE1, SRC, STK11, SUFU, TAL1, TAL2, TAN-1, TIAM1, TERC, TERT, TIMP3, TMEM127, TNF, TP53, TRAC, TSC1, TSC2, TRK, VHL, WRN, and WT1. Non-limiting examples of oncogenes are KRAS, NRAS, BRAF, MYC, CTNNB1, and EGFR. In some instances, the oncogene is a gene that encodes a cyclin dependent kinase (CDK). Non-limiting examples of CDKs are CDK1, CDK4, CDK5, CDK7, CDK8, CDK9, CDK11, and CDK20. Non-limiting examples of tumor suppressor genes are TP53, RB1, and PTEN.

Infections

Described herein are methods for treating an infection in a subject, wherein the infection is caused by one or more pathogens, parasites, or any combination thereof. Such methods can include modifying a target nucleic acid associated with the pathogen or parasite causing the infection. Compositions and methods may modify a target nucleic acid associated with the pathogen or parasite causing the infection. In some embodiments, the target nucleic acid can be in the pathogen or parasite itself or in a cell, tissue or organ of the subject that the pathogen or parasite infects. In some embodiments, the pathogen is a bacterium, a virus, a fungus, or any combination thereof. In some embodiments, the methods described herein include treating an infection caused by one or more bacterial pathogens. Such bacterial pathogens, in some embodiments, comprise, without limitation, Acholeplasma laidlawii, Brucella abortus, Chlamydia psittaci, Chlamydia trachomatis, Cryptococcus neoformans, Escherichia coli, Legionella pneumophila, Lyme disease spirochetes, methicillin-resistant Staphylococcus aureus, Mycobacterium leprae, Mycobacterium tuberculosis, Mycoplasma arginini, Mycoplasma arthritidis, Mycoplasma genitalium, Mycoplasma hyorhinis, Mycoplasma orale, Mycoplasma pneumoniae, Mycoplasma salivarium, Neisseria gonorrhoeae, Neisseria meningitidis, Pneumococcus, Pseudomonas aeruginosa, sexually transmitted infection, Streptococcus agalactiae, Streptococcus pyogenes, Treponema pallidum, or any combination thereof.

In some embodiments, the methods described herein include treating an infection caused by one or more viral pathogens. Such viral pathogens, in some embodiments, comprise, without limitation, adenovirus, blue tongue virus, chikungunya, coronavirus (e.g., SARS-CoV-2), cytomegalovirus, Dengue virus, Ebola, Epstein-Barr virus, feline leukemia virus, Hemophilus influenzae B, Hepatitis Virus A, Hepatitis Virus B, Hepatitis Virus C, herpes simplex virus I, herpes simplex virus II, human papillomavirus (HPV), human serum parvo-like virus, human T-cell leukemia viruses, immunodeficiency virus (e.g., HIV), influenza virus, lymphocytic choriomeningitis virus, measles virus, mouse mammary tumor virus, mumps virus, murine leukemia virus, polio virus, rabies virus, Reovirus, respiratory syncytial virus (RSV), rubella virus, Sendai virus, simian virus 40, Sindbis virus, varicella-zoster virus, vesicular stomatitis virus, wart virus, West Nile virus, yellow fever virus, or any combination thereof.

In some embodiments, the methods described herein include treating an infection caused by one or more parasites. Such parasites, in some embodiments, comprise, without limitation, helminths, annelids, platyhelminths, nematodes, and thorny-headed worms. In some embodiments, parasitic pathogens comprise, without limitation, Babesia bovis, Echinococcus granulosus, Eimeria tenella, Leishmania tropica, Mesocestoides corti, Onchocerca volvulus, Plasmodium falciparum, Plasmodium vivax, Schistosoma japonicum, Schistosoma mansoni, Taenia hydatigena, Taenia ovis, Taenia saginata, Theileria parva, Toxoplasma gondii, Trichinella spiralis, Trichomonas vaginalis, Trypanosoma brucei, Trypanosoma cruzi, Trypanosoma rangeli, Trypanosoma rhodesiense, Balantidium coli, Entamoeba histolytica, Giardia spp., Isospora spp., Trichomonas spp., or any combination thereof.

XV. Methods of Modifying Target Nucleic Acids

Disclosed herein are compositions and methods for modifying a target nucleic acid. The target nucleic acid may be a gene or a portion thereof. Methods and compositions may modify a coding portion of a gene, a non-coding portion of a gene, or a combination thereof. Modifying at least one gene using the compositions and methods described herein can, in some embodiments, induce a reduction or increase in expression of the one or more genes. In some embodiments, the at least one modified gene results in a reduction in expression, also referred to as gene silencing. In some embodiments, the gene silencing reduces expression of one or more genes by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95%. In some embodiments, compositions and methods remove all expression of a gene, also referred to as genetic knock out. In some embodiments, compositions and methods increase expression of one or more genes by at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100%. In some embodiments, gene silencing is accomplished by transcriptional silencing, post-transcriptional silencing, or meiotic silencing. In some embodiments, transcriptional silencing is by genomic imprinting, paramutation, transposon silencing, position effect, or RNA-directed DNA methylation. In some embodiments, post-transcriptional silencing is by RNA interference, RNA silencing, or nonsense-mediated decay. In some embodiments, meiotic silencing is by transvection or meiotic silencing of unpaired DNA. In some embodiments, the at least one modified gene results in removing all expression, also referred to as the gene being knocked out (KO).

In some embodiments, a gene is modified by repairing or editing a mutation as described herein. In some cases, a Cas protein is used to effect the modification. Cas proteins may be fused to transcription activators or transcriptional repressors or deaminases or other nucleic acid modifying proteins. In some instances, compositions and methods use Cas proteins that are fused to a heterologous protein. Heterologous proteins include, but are not limited to, transcriptional activators, transcriptional repressors, deaminases, methyltransferases, acetyltransferases, and other nucleic acid modifying proteins. In some cases, Cas proteins need not be fused to a partner protein to accomplish the required protein (expression) modification.

In some embodiments, compositions and methods comprise a nucleic acid expression vector, or use thereof, to introduce a Cas protein, guide nucleic acid, donor template or any combination thereof to a cell. In some embodiments, the nucleic acid expression vector is a viral vector. Viral vectors include, but are not limited to, retroviruses, adenoviruses, adeno-associated viruses, and herpes simplex viruses. In some embodiments, the viral vector is a replication-defective viral vector, comprising an insertion of a therapeutic gene inserted in genes essential to the lytic cycle, preventing the virus from replicating and exerting cytotoxic effects. In some embodiments, the viral vector is an adeno-associated viral (AAV) vector. In some embodiments, the nucleic acid expression vector is a non-viral vector. In some embodiments, compositions and methods comprise a lipid, polymer, nanoparticle, or a combination thereof, or use thereof, to introduce a Cas protein, guide nucleic acid, donor template or any combination thereof to a cell. Non-limiting examples of lipids and polymers are cationic polymers, cationic lipids, or bio-responsive polymers. In some embodiments, the bio-responsive polymer exploits chemical-physical properties of the endosomal environment (e.g., pH) to preferentially release the genetic material in the intracellular space.

In some embodiments, treatment of a disease comprises administration of a gene therapy. “Gene therapy”, as used herein, comprises use of a recombinant nucleic acid (DNA or RNA), administered for the purpose of adjusting, repairing, replacing, adding, or removing a gene sequence. In some embodiments, a gene therapy comprises use of a vector to introduce a functional gene or transgene. In some embodiments, vectors comprise nonviral vectors, including cationic polymers, cationic lipids, or bio-responsive polymers. In some embodiments, the bio-responsive polymer exploits chemical-physical properties of the endosomal environment (e.g., pH) to preferentially release the genetic material in the intracellular space. In some embodiments, vectors comprise viral vectors, including retroviruses, adenoviruses, adeno-associated viruses, and herpes simplex viruses. In some embodiments, the vector comprises a replication-defective viral vector comprising a therapeutic gene inserted in genes essential to the lytic cycle, preventing the virus from replicating and exerting cytotoxic effects. Methods of gene therapy are described in more detail in Ingusci et al., “Gene Therapy Tools for Brain Diseases”, Front. Pharmacol. 10:724 (2019), which is hereby incorporated by reference in its entirety.

It is known that CRISPR-Cas9 gene editing techniques may select for p53-mutated cells. Similarly, the presence of KRAS mutations provides a selective advantage during CRISPR-Cas9 gene editing, as further described in Sinha et al., “A systematic genome-wide mapping of oncogenic mutation selection during CRISPR-Cas9 genome editing”, Nature Comm. 12:6512 (2021), which is hereby incorporated by reference in its entirety. In some embodiments, a genome targeted for treatment comprises a wild-type p53 gene, a wild-type KRAS gene, a mutated p53 gene, a mutated KRAS gene, or any combination thereof. In some embodiments, the genome comprises a p53 mutation and the target gene comprises WDR48, H2AFX, FANCG, BRIP1, HUS1, XRCC3, PALB2, FANCL, FANCA, FANCC, BRCA1, BRCA2, or any combination thereof. In some embodiments, the genome comprises a wild-type p53 and the target gene comprises CCNB1, MCM6, ANAPC11, ANAPC10, CDKN1A, or any combination thereof. In some embodiments, the genome comprises a KRAS mutation and the target gene comprises CRYAA, RTCA, LOR, SLC35B4, EN1, CELA3B, NOG, or any combination thereof.

In some instances, the compositions described herein are for use in therapy. For example, in some instances, the compositions described herein are for use in treating a disease or condition described herein.

Also provided is the use of the compositions described herein in the manufacture of a medicament. Also provided is the use of the compositions described herein in the manufacture of a medicament for therapeutic and/or prophylactic treatment of a disease or condition described herein.

XVI. Target Nucleic Acids and Samples

Disclosed herein are compositions, systems and methods for detecting and/or modifying a target nucleic acid. In some instances, the target nucleic acid is a single stranded nucleic acid. Alternatively, or in combination, the target nucleic acid is a double stranded nucleic acid and is prepared into single stranded nucleic acids before or upon contacting the reagents. In some instances, the target nucleic acid is a double stranded nucleic acid. In some instances, the double stranded nucleic acid is DNA. The target nucleic acid may be an RNA. The target nucleic acids include but are not limited to mRNA, rRNA, tRNA, non-coding RNA, long non-coding RNA, and microRNA (miRNA). In some instances, the target nucleic acid is complementary DNA (cDNA) synthesized from a single-stranded RNA template in a reaction catalyzed by a reverse transcriptase. In some cases, the target nucleic acid is single-stranded RNA (ssRNA) or mRNA. In some cases, the target nucleic acid is from a virus, a parasite, or a bacterium described herein. As another non-limiting example, the target nucleic acid may be responsible for a disease, contain a mutation (e.g., single strand polymorphism, point mutation, insertion, or deletion), be contained in an amplicon, or be uniquely identifiable from the surrounding nucleic acids (e.g., contain a unique sequence of nucleotides).

In certain embodiments, the target nucleic acid is a double stranded nucleic acid comprising a target strand and a non-target strand, wherein the target strand comprises a target sequence. In some embodiments, where a target strand comprises a target sequence, at least a portion of the engineered guide nucleic acid is complementary to the target sequence on the target strand. In some embodiments, where the target nucleic acid is a double stranded nucleic acid comprising a target strand and a non-target strand, and wherein the target strand comprises a target sequence, at least a portion of the engineered guide nucleic acid is complementary to the target sequence on the target strand. In some embodiments, a target nucleic acid comprises a PAM as described herein that is located on the non-target strand. Such a PAM described herein, in some embodiments, is adjacent (e.g., within 1, 2, 3, 4 or 5 nucleotides) to the 5′ end of the target sequence on the non-target strand of the double stranded DNA molecule. In certain embodiments, such a PAM described herein is directly adjacent to the 5′ end of a target sequence on the non-target strand of the double stranded DNA molecule.
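As a purely illustrative sketch of the PAM placement described above, the following Python snippet scans a non-target strand (written 5′ to 3′) for a PAM and reports the stretch of that strand lying immediately 3′ of the PAM, i.e., with the PAM adjacent to its 5′ end. The function name, the `target_len` and `gap` parameters, and the toy sequence are assumptions made for illustration only and are not taken from the disclosure; `gap=0` models the "directly adjacent" case, and values of 1 to 5 model a PAM within a few nucleotides.

```python
# Standard IUPAC nucleotide codes (subset used by the PAMs in this section).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "W": "AT", "N": "ACGT",
}

def find_target_sites(non_target, pam, target_len=20, gap=0):
    """Return (pam_start, adjacent_sequence) for every PAM hit.

    non_target : non-target strand, 5' to 3' (hypothetical input)
    pam        : PAM, possibly degenerate (IUPAC codes)
    target_len : assumed length of the PAM-adjacent sequence
    gap        : nucleotides between the PAM and the adjacent sequence
    """
    sites = []
    k = len(pam)
    last_start = len(non_target) - k - gap - target_len
    for i in range(last_start + 1):
        window = non_target[i:i + k]
        if all(base in IUPAC[code] for code, base in zip(pam, window)):
            start = i + k + gap
            sites.append((i, non_target[start:start + target_len]))
    return sites

# Toy example: one CTT PAM at index 2, followed by a 6-nt stretch.
print(find_target_sites("GGCTTACGTACGTAAGG", "CTT", target_len=6))
# [(2, 'ACGTAC')]
```

The sketch only locates candidate sites by sequence; it does not model effector protein binding, guide design rules, or cleavage.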

In some cases, an effector protein (e.g., a D2S effector protein) or a multimeric complex thereof recognizes a PAM on a target nucleic acid. In some cases, multiple effector proteins of the multimeric complex recognize a PAM on a target nucleic acid. In some cases, only one effector protein of the multimeric complex recognizes a PAM on a target nucleic acid. In some cases, the PAM is 3′ to the spacer region of the crRNA. In some cases, the PAM is directly 3′ to the spacer region of the crRNA. In some cases, the PAM sequence comprises a sequence listed in TABLE 6. In some instances, the PAM sequence comprises a sequence listed in TABLE 13. In some instances, the PAM sequence comprises a sequence listed in TABLE 14. In some instances, the PAM sequence comprises a sequence listed in TABLE 16. In some instances, the PAM sequence comprises a sequence listed in TABLE 17. In some instances, the PAM sequence comprises a sequence listed in TABLE 20. In some instances, the PAM sequence comprises a sequence listed in TABLE 21. In some instances, the PAM sequence comprises a sequence listed in TABLE 23. In some instances, the PAM sequence comprises a sequence listed in TABLE 24.

A D2S effector protein of the present disclosure, a dimer thereof, or a multimeric complex thereof may cleave or nick a target nucleic acid within or near a protospacer adjacent motif (PAM) sequence of the target nucleic acid. In some instances, cleavage occurs within 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 nucleosides of a 5′ or 3′ terminus of a PAM sequence. A target nucleic acid may comprise a PAM sequence adjacent to a sequence that is complementary to a guide nucleic acid spacer region. In some cases, the PAM sequence is 5′-CTT-3′ (SEQ ID NO: 154). In some cases, the PAM sequence is 5′-CC-3′ (SEQ ID NO: 155). In some cases, the PAM sequence is 5′-TCG-3′ (SEQ ID NO: 156). In some cases, the PAM sequence is 5′-GCG-3′ (SEQ ID NO: 157). In some cases, the PAM sequence is 5′-TTG-3′ (SEQ ID NO: 158). In some cases, the PAM sequence is 5′-GTG-3′ (SEQ ID NO: 159). In some cases, the PAM sequence is 5′-ATTA-3′ (SEQ ID NO: 160). In some cases, the PAM sequence is 5′-ATTG-3′ (SEQ ID NO: 161). In some cases, the PAM sequence is 5′-GTTA-3′ (SEQ ID NO: 162). In some cases, the PAM sequence is 5′-GTTG-3′ (SEQ ID NO: 163). In some cases, the PAM sequence is 5′-TC-3′ (SEQ ID NO: 164). In some cases, the PAM sequence is 5′-ACTG-3′ (SEQ ID NO: 165). In some cases, the PAM sequence is 5′-GCTG-3′ (SEQ ID NO: 166). In some cases, the PAM sequence is 5′-TTC-3′ (SEQ ID NO: 167). In some cases, the PAM sequence is 5′-TTT-3′ (SEQ ID NO: 168).

In some cases, the PAM sequence is 5′-G-3′ (SEQ ID NO: 301). In some cases, the PAM sequence is 5′-T-3′ (SEQ ID NO: 302). In some cases, the PAM sequence is 5′-NRNNNNN-3′ (SEQ ID NO: 303). In some cases, the PAM sequence is 5′-NNANRTT-3′ (SEQ ID NO: 304). In some cases, the PAM sequence is 5′-NNKRTTN-3′ (SEQ ID NO: 305). In some cases, the PAM sequence is 5′-NNNCTTN-3′ (SEQ ID NO: 306). In some cases, the PAM sequence is 5′-NNNGNNN-3′ (SEQ ID NO: 307). In some cases, the PAM sequence is 5′-NNNGTYG-3′ (SEQ ID NO: 308). In some cases, the PAM sequence is 5′-NNNGTYN-3′ (SEQ ID NO: 309). In some cases, the PAM sequence is 5′-NNNKNTK-3′ (SEQ ID NO: 310). In some cases, the PAM sequence is 5′-NNNKNTT-3′ (SEQ ID NO: 311). In some cases, the PAM sequence is 5′-NNNNCCN-3′ (SEQ ID NO: 312). In some cases, the PAM sequence is 5′-NNNNCCR-3′ (SEQ ID NO: 313). In some cases, the PAM sequence is 5′-NNNNCTT-3′ (SEQ ID NO: 314). In some cases, the PAM sequence is 5′-CC-3′ (SEQ ID NO: 315). In some cases, the PAM sequence is 5′-CG-3′ (SEQ ID NO: 316). In some cases, the PAM sequence is 5′-CT-3′ (SEQ ID NO: 317). In some cases, the PAM sequence is 5′-TG-3′ (SEQ ID NO: 318). In some cases, the PAM sequence is 5′-TN-3′ (SEQ ID NO: 319). In some cases, the PAM sequence is 5′-TY-3′ (SEQ ID NO: 320). In some cases, the PAM sequence is 5′-NNNNNYN-3′ (SEQ ID NO: 321). In some cases, the PAM sequence is 5′-NNNNNYR-3′ (SEQ ID NO: 322). In some cases, the PAM sequence is 5′-T-3′ (SEQ ID NO: 323). In some cases, the PAM sequence is 5′-NNNNRTT-3′ (SEQ ID NO: 324). In some cases, the PAM sequence is 5′-NNNNTCG-3′ (SEQ ID NO: 325). In some cases, the PAM sequence is 5′-NNNNKCG-3′ (SEQ ID NO: 326). In some cases, the PAM sequence is 5′-NNNNKYG-3′ (SEQ ID NO: 327). In some cases, the PAM sequence is 5′-NNNNTYG-3′ (SEQ ID NO: 328). In some cases, the PAM sequence is 5′-NNNNTNN-3′ (SEQ ID NO: 329). In some cases, the PAM sequence is 5′-NNNNTNY-3′ (SEQ ID NO: 330). In some cases, the PAM sequence is 5′-NNNNTTC-3′ (SEQ ID NO: 331). In some cases, the PAM sequence is 5′-NNNNTTN-3′ (SEQ ID NO: 332). In some cases, the PAM sequence is 5′-NNNNTTY-3′ (SEQ ID NO: 333). In some cases, the PAM sequence is 5′-NNNNTYC-3′ (SEQ ID NO: 334). In some cases, the PAM sequence is 5′-NNNNTYN-3′ (SEQ ID NO: 335). In some cases, the PAM sequence is 5′-NNNNTYR-3′ (SEQ ID NO: 336). In some cases, the PAM sequence is 5′-NNNNYTC-3′ (SEQ ID NO: 337). In some cases, the PAM sequence is 5′-NNNNYTN-3′ (SEQ ID NO: 338). In some cases, the PAM sequence is 5′-NNNNYTY-3′ (SEQ ID NO: 339). In some cases, the PAM sequence is 5′-C-3′ (SEQ ID NO: 340). In some cases, the PAM sequence is 5′-NNNRNNG-3′ (SEQ ID NO: 341). In some cases, the PAM sequence is 5′-NNNRTNG-3′ (SEQ ID NO: 342). In some cases, the PAM sequence is 5′-NNNRTRG-3′ (SEQ ID NO: 343). In some cases, the PAM sequence is 5′-NNNRTTG-3′ (SEQ ID NO: 344). In some cases, the PAM sequence is 5′-NNNRTTN-3′ (SEQ ID NO: 345). In some cases, the PAM sequence is 5′-NNNRTWG-3′ (SEQ ID NO: 346). In some cases, the PAM sequence is 5′-NNNTKCG-3′ (SEQ ID NO: 347). In some cases, the PAM sequence is 5′-NNNTNCG-3′ (SEQ ID NO: 348). In some cases, the PAM sequence is 5′-NNNTNTG-3′ (SEQ ID NO: 349). In some cases, the PAM sequence is 5′-NNNTNYN-3′ (SEQ ID NO: 350). In some cases, the PAM sequence is 5′-NNNTTCN-3′ (SEQ ID NO: 351). In some cases, the PAM sequence is 5′-NNNTTNY-3′ (SEQ ID NO: 352). In some cases, the PAM sequence is 5′-NNNTTTN-3′ (SEQ ID NO: 353). In some cases, the PAM sequence is 5′-NNNTTYN-3′ (SEQ ID NO: 354). In some cases, the PAM sequence is 5′-NNNTYCT-3′ (SEQ ID NO: 355). In some cases, the PAM sequence is 5′-NNNTYYN-3′ (SEQ ID NO: 356). In some cases, the PAM sequence is 5′-NNNTYYW-3′ (SEQ ID NO: 357). In some cases, the PAM sequence is 5′-NNNWNCT-3′ (SEQ ID NO: 358). In some cases, the PAM sequence is 5′-NNNTYYT-3′ (SEQ ID NO: 359). In some cases, the PAM sequence is 5′-TG-3′ (SEQ ID NO: 360). In some cases, the PAM sequence is 5′-NNNWYTG-3′ (SEQ ID NO: 361). In some cases, the PAM sequence is 5′-NNNYTTR-3′ (SEQ ID NO: 362). In some cases, the PAM sequence is 5′-NNRGTYG-3′ (SEQ ID NO: 363). In some cases, the PAM sequence is 5′-NNTNTR-3′ (SEQ ID NO: 364). In some cases, the PAM sequence is 5′-NNTTTYN-3′ (SEQ ID NO: 365). In some cases, the PAM sequence is 5′-NNWTTYN-3′ (SEQ ID NO: 366). In some cases, the PAM sequence is 5′-NNWWTTN-3′ (SEQ ID NO: 367).

In some cases, the PAM sequence is 5′-TNTG-3′ (SEQ ID NO: 368). In some cases, the PAM sequence is 5′-NTCG-3′ (SEQ ID NO: 369). In some cases, the PAM sequence is 5′-RTTR-3′ (SEQ ID NO: 370). In some cases, the PAM sequence is 5′-NTTC-3′ (SEQ ID NO: 371). In some cases, the PAM sequence is 5′-TCG-3′ (SEQ ID NO: 156). In some cases, the PAM sequence is 5′-TTR-3′ (SEQ ID NO: 786). In some cases, the PAM sequence is 5′-TR-3′ (SEQ ID NO: 787). In some cases, the PAM sequence is 5′-TTTR-3′ (SEQ ID NO: 788). In some cases, the PAM sequence is 5′-CC-3′ (SEQ ID NO: 155). In some cases, the PAM sequence is 5′-TTTYC-3′ (SEQ ID NO: 789). In some cases, the PAM sequence is 5′-CCN-3′ (SEQ ID NO: 790). In some cases, the PAM sequence is 5′-TG-3′ (SEQ ID NO: 791). In some cases, the PAM sequence is 5′-TNTG-3′ (SEQ ID NO: 368). In some cases, the PAM sequence is 5′-GGTYG-3′ (SEQ ID NO: 792). In some cases, the PAM sequence is 5′-TTTC-3′ (SEQ ID NO: 930). In some cases, the PAM sequence is 5′-WTTR-3′ (SEQ ID NO: 931).

In some cases, a PAM sequence comprises a sequence in TABLE 39. TABLE 39 shows PAM sequences that are associated with different effector proteins.

TABLE 39. PAM Sequences Associated With Various Effector Proteins

Enzyme SEQ ID NO | Associated PAMs
1 | CTT
4 | TTC, TTTC
5 | TTY
8 | TTC, YTN
9 | GNNN
12 | YTTR, TTYN
13 | CTT
14 | CC
15 | CC
16 | CC
18 | CC
19 | CC
20 | CC
21 | TC
22 | TCG
23 | TCG, KYG
24 | TCG
25 | RTTR
26 | TCG
28 | RTTR
29 | RTTG, RTTR
30 | TCG, RTTR
31 | RTTR
32 | TCG, KCG
33 | KNTK, KNTT
34 | RTTR
35 | TTC, YTC
36 | TTC, TTCN
37 | TTY, TY
38 | TTC, TTCN
39 | TYYT, YN, CTTN, T
40 | TTC
41 | YT, WNCT
42 | TTC, TTYN, TYYW
43 | TTC
44 | TTY
45 | TTY, TY, TTC
202 | RTTN, TCG, RTTR, KRTTN
203 | CCN, CCR
204 | TTYN, WTTYN
205 | RTTN
206 | TG, TNTG, G
207 | RTT, ANRTT
208 | RTTR, RTWG
209 | CCN
210 | TTYN, YN, YTTR
212 | TTTN
213 | GTYG, RGTYG
215 | RTRG
216 | RTNG
217 | RTTN
219 | RTTR
220 | TCG, KCG
221 | TG, WNTG
222 | RTTR
225 | RTRG
227 | TYN
228 | TG, TNTG, WYTG, WNTG
229 | TCG, RTTR
231 | CCN, CCR
232 | TYN, WWTTN, TTTYN
233 | TG, TNTG, WNTG
234 | TTC, TTNY
236 | TCG, RTTR
237 | RTTR
238 | TCG
239 | CC
240 | TTR, WTTR, RTRG
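The degenerate PAMs in TABLE 39 use standard IUPAC nucleotide codes (R = A or G, Y = C or T, K = G or T, W = A or T, N = any base). As a non-authoritative illustration of how such a PAM relates to the concrete sequences recited elsewhere in this section, the following Python sketch expands a degenerate PAM into every sequence it matches. The function names `expand_pam` and `matches_pam` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide codes (subset used by the PAMs in this section).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "W": "AT", "N": "ACGT",
}

def expand_pam(pam):
    """Return every concrete sequence matched by a degenerate PAM."""
    return ["".join(bases) for bases in product(*(IUPAC[code] for code in pam))]

def matches_pam(pam, seq):
    """True if seq has the same length as pam and satisfies every position."""
    return len(pam) == len(seq) and all(
        base in IUPAC[code] for code, base in zip(pam, seq)
    )

print(expand_pam("RTTR"))  # ['ATTA', 'ATTG', 'GTTA', 'GTTG']
print(expand_pam("KCG"))   # ['GCG', 'TCG']
print(matches_pam("KYG", "GTG"))  # True
```

For example, expanding RTTR yields ATTA, ATTG, GTTA, and GTTG, which are the four PAM sequences recited individually for effector proteins associated with RTTR.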

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 1, and the target nucleic acid comprises a PAM sequence of CTT (SEQ ID NO: 154). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 13, and the target nucleic acid comprises a PAM sequence of CTT (SEQ ID NO: 154).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 15, and the target nucleic acid comprises a PAM sequence of CC (SEQ ID NO: 155). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 22, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 22, and the target nucleic acid comprises a PAM sequence of GCG (SEQ ID NO: 157).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, and the target nucleic acid comprises a PAM sequence of TTG (SEQ ID NO: 158). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, and the target nucleic acid comprises a PAM sequence of GCG (SEQ ID NO: 157). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 23, and the target nucleic acid comprises a PAM sequence of GTG (SEQ ID NO: 159).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 24, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25, and the target nucleic acid comprises a PAM sequence of ATTA (SEQ ID NO: 160). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25, and the target nucleic acid comprises a PAM sequence of ATTG (SEQ ID NO: 161). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25, and the target nucleic acid comprises a PAM sequence of GTTA (SEQ ID NO: 162). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 25, and the target nucleic acid comprises a PAM sequence of GTTG (SEQ ID NO: 163).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 26, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28, and the target nucleic acid comprises a PAM sequence of ATTA (SEQ ID NO: 160). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28, and the target nucleic acid comprises a PAM sequence of ATTG (SEQ ID NO: 161). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28, and the target nucleic acid comprises a PAM sequence of GTTA (SEQ ID NO: 162). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 28, and the target nucleic acid comprises a PAM sequence of GTTG (SEQ ID NO: 163).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31, and the target nucleic acid comprises a PAM sequence of ATTA (SEQ ID NO: 160). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31, and the target nucleic acid comprises a PAM sequence of ATTG (SEQ ID NO: 161). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31, and the target nucleic acid comprises a PAM sequence of GTTA (SEQ ID NO: 162). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 31, and the target nucleic acid comprises a PAM sequence of GTTG (SEQ ID NO: 163).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 32, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 32, and the target nucleic acid comprises a PAM sequence of GCG (SEQ ID NO: 157).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 21, and the target nucleic acid comprises a PAM sequence of TC (SEQ ID NO: 164).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 29, and the target nucleic acid comprises a PAM sequence of ATTG (SEQ ID NO: 161). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 29, and the target nucleic acid comprises a PAM sequence of ACTG (SEQ ID NO: 165). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 29, and the target nucleic acid comprises a PAM sequence of GTTG (SEQ ID NO: 163). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 29, and the target nucleic acid comprises a PAM sequence of GCTG (SEQ ID NO: 166).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 30, and the target nucleic acid comprises a PAM sequence of TCG (SEQ ID NO: 156).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34, and the target nucleic acid comprises a PAM sequence of ATTA (SEQ ID NO: 160). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34, and the target nucleic acid comprises a PAM sequence of ATTG (SEQ ID NO: 161). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34, and the target nucleic acid comprises a PAM sequence of GTTA (SEQ ID NO: 162). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 34, and the target nucleic acid comprises a PAM sequence of GTTG (SEQ ID NO: 163).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 44, and the target nucleic acid comprises a PAM sequence of TTC (SEQ ID NO: 167).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 45, and the target nucleic acid comprises a PAM sequence of TTT (SEQ ID NO: 168). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 45, and the target nucleic acid comprises a PAM sequence of TTC (SEQ ID NO: 167).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 18, and the target nucleic acid comprises a PAM sequence of CC (SEQ ID NO: 155). In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 19, and the target nucleic acid comprises a PAM sequence of CC (SEQ ID NO: 155).

In some instances, the effector protein comprises an amino acid sequence that is at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99%, or 100% identical to SEQ ID NO: 43, and the target nucleic acid comprises a PAM sequence of TTC (SEQ ID NO: 167).
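The percent identity thresholds recited above can be illustrated with a short Python sketch. It assumes the two amino acid sequences have already been aligned (gaps written as "-") and uses one common convention, namely identical positions divided by aligned columns; this convention, the function name, and the toy peptides are assumptions made for illustration, and any definition of sequence identity given elsewhere in this disclosure governs.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Toy peptides (hypothetical, not sequences from the Sequence Listing).
identity = percent_identity("MKTAYIAK", "MKTAHIAK")
print(identity)          # 87.5
print(identity >= 85.0)  # True
print(identity >= 90.0)  # False
```

Under this convention, the toy pair would satisfy an "at least 85%" threshold but not an "at least 90%" threshold.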

In some cases, the target nucleic acid comprises 5 to 100, 5 to 90, 5 to 80, 5 to 70, 5 to 60, 5 to 50, 5 to 40, 5 to 30, 5 to 25, 5 to 20, 5 to 15, or 5 to 10 linked nucleosides. In some cases, the target nucleic acid comprises 10 to 90, 20 to 80, 30 to 70, or 40 to 60 linked nucleosides. In some cases, the target nucleic acid comprises 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, 50, 60, 70, 80, 90, or 100 linked nucleosides. In some instances, the target nucleic acid comprises at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, or at least 100 linked nucleosides.

In some cases, the target nucleic acid is AAVS1, ABCA4, ABCB11, ABCC8,ABCD1, ACAD9, ACADM, ACADVL, ACAT1, ACOX1, ACSF3, ADA, ADAMTS2, ADGRG1,AGA, AGL, AGPS, AGXT, AHI1, AIRE, ALDH3A2, ALDOB, ALG6, ALK, ALKBH5,ALMS1, ALPL, AMRC9, AMT, ANGPTL3, APC, Apo(a), APOCIII, APOEc4, APOL1,APP, AQP2, AR, ARFRP1, ARG1, ARL13B, ARL6, ARSA, ARSB, ASL, ASNS, ASPA,ASS1, ATM, ATP6V1B1, ATP7A, ATP7B, ATRX, ATXN1, ATXN10, ATXN2, ATXN3,ATXN7, ATXN8OS, AXIN1, AXIN2, B2M, BACE-1, BAK1, BAP1, BARD1, BAX2,BBS1, BBS10, BBS12, BBS2, BCKDHA, BCKDHB, BCL2L2, BCS1L, BEST1,Betaglobin gene, BLM, BMPR1A, BRAFV600E, BRCA1, BRCA2, BRIP1, BSND,C282Y, C9orf72, CA4, CACNA1A, CAPN3, CASR, CBS, CC2D2A, CCR5, CDC73,CDH1, CDH23, CDK11, CDK4, CDKN1B, CDKN1C, CDKN2A, CEBPA, CEP290, CERKL,CFTR, CHCHD10, CHEK2, CHM, CHRNE, CIITA, CLN3, CLN5, CLN6, CLN8, CLRN1,CLTA, CNBP, CNGB1, CNGB3, COL1A1, COL1A2, COL27A1, COL4A3, COL4A4,COL4A5, COL7A1, CPS1, CPT1A, CPT2, CRB1, CRX, CTNNA1, CTNNB1, CTNND2,CTNS, CTSK, CYBA, CYBB, CYP11B1, CYP11B2, CYP17A1, CYP19A1, CYP27A1,DBT, DCLRE1C, DERL2, DFNA36, DFNB31, DGAT2, DHCR7, DHDDS, DICER1,DIS3L2, DLD, DMD, DMPK, DNAH5, DNAI1, DNAI2, DNM2, DNMT1, DYSF, EDA,EDN3, EDNRB, EGFR, EIF2B5, EMC2, EMC3, EMD, EMX1, EPCAM, ERCC6, ERCC8,ESCO2, ETFA, ETFDH, ETHE1, EVC, EVC2, EYS, F5, F9, FactorB, FactorXI,FAH, FAM161A, FANCA, FANCB, FANCC, FANCD1, FANCD2, FANCE, FANCF, FANCG,FANCI, FANCJ, FANCL, FANCM, FANCN, FANCP, FANCS, FBN1, FGF14, FGFR2,FGFR3, FH, FHL1, FKRP, FKTN, FLCN, FMR1, FOXP3, FSCN2, FUS, FUT8, FVIII,FXII, FXN, G6PC, GAA, GALC, GALK1, GALT, GAMT, GATA2, GBA, GBE1, GCDH,GCGR, GDNF, GFAP, GFM1, GHR, GJB1, GJB2, GLA, GLB1, GLDC, GLE1, GNE,GNPTAB, GNPTG, GNS, GPC3, GPR98, GREM1, GRHPR, GRIN2B, H2AX, HADHA,HAX1, HBA1, HBA2, HBB, HEXA, HEXB, HGSNAT, HLCS, HMGCL, HOGA1, HOXB13,HPRPF3, HPRT1, HPS1, HPS3, HRAS, HSD17B4, HSD3B2, HTT, HYAL1, HYLS1,IDS, IDUA, IFITM5, IKBKAP, IL2RG, IMPDH1, INPP5E, IRF4, ITPR1, IVD,JAG1, KCNC3, KCND3, KCNJ11, KLHL7, KRAS, LAMA2, LAMA3, LAMB3, LAMC2,LCA5, 
LDLR, LDLRAP1, LHX3, LIFR, LIPA, LMNA, LOXHD1, LPL, LRAT, LRP6,LRPPRC, LRRK2, MAN2B1, MAPT, MAX, MCOLN1, MECP2, MED17, MEFV, MEN1,MERTK, MESP2, MET, METex14, MFN2, MFSD8, MITF, MKS1, MLC1, MLH1, MLH3,MMAA, MMAB, MMACHC, MMADHC, MMD, MPI, MPL, MPV17, MSH2, MSH3, MSH6,MTHFR, MTM1, MTRR, MTTP, MUT, MUTYH, MYO7A, NAGLU, NAGS, NBN, NDRG1,NDUFAF5, NDUFS6, NEB, NF1, NF2, NOTCH2, NPC1, NPC2, NPHP1, NPHS1, NPHS2,NR2E3, NTHL1, NTRK, NTRK1, OAT, OCT4, OFD1, OPA3, OTC, PAH, PALB2,PAQR8, PAX3, PC, PCCA, PCCB, PCDH15, PCSK9, PD1, PDCD1, PDE6B, PDGFRA,PDHA1, PDHB, PEX1, PEX10, PEX12, PEX13, PEX14, PEX16, PEX19, PEX2,PEX26, PEX3, PEX5, PEX6, PEX7, PFKM, PHGDH, PHOX2B, PKD1, PKD2, PKHD1,PKK, PLEKHG4, PMM2, PMP22, PMS1, PMS2, PNPLA3, POLD1, POLE, POMGNT1,POT1, POU5F1, PPM1A, PPP2R2B, PPT1, PRCD, PRKAR1A, PRKCG, PRNP, PROM1,PROP1, PRPF31, PRPF8, PRPH2, PRPS1, PSAP, PSD95, PSEN1, PSEN2, PTCH1,PTEN, PTS, PUS1, PYGM, RAB23, RAD50, RAD51C, RAD51D, RAG2, RAPSN, RARS2,RB1, RDH12, RECQL4, RET, RHO, RICTOR, RMRP, ROS1, RP1, RP2, RPE65, RPGR,RPGRIP1L, RPL32P3, RS1, RTEL1, RUNX1, SACS, SAMHD1, SCN1A, SCN2A, SDHA,SDHAF2, SDHB, SDHC, SDHD, SEL1L, SEPSECS, SERPING1, SGCA, SGCB, SGCG,SGSH, SIRT1, SLC12A3, SLC12A6, SLC17A5, SLC22A5, SLC25A13, SLC25A15,SLC26A2, SLC26A4, SLC35A3, SLC37A4, SLC39A4, SLC4A11, SLC6A8, SLC7A7,SMAD4, SMARCA4, SMARCAL1, SMARCB1, SMARCE1, SMN1, SMPD1, SNAI2, SNCA,SNRNP200, SOD1, SOX10, SPARA7, SPTBN2, STAR, STAT3, STK11, SUFU, SUMF1,SYNE1, SYNE2, SYS1, TARDBP, TAT, TBK1, TBP, TCIRG1, TCTN3, TECPR2, TERC,TERT, TFR2, TGFBR2, TGM1, TH, TLE3, TMEM127, TMEM138, TMEM216, TMEM43,TMEM67, TMPRSS6, TOP1, TOPORS, TP53, TPP1, TRAC, TRMU, TSFM, TSPAN14,TTBK2, TTC8, TTPA, TTR, TULP1, TYMP, UBE2G2, UBE2J1, UBE3A, USH1C,USH1G, USH2A, VEGF, VHL, VPS13A, VPS13B, VPS35, VPS45, VRK1, VSX2, VWF,WDR19, WNT10A, WS2B, WS2C, XPA, XPC, XPF, YAP1, ZFYVE26, or ZNF423.

In some cases, the target nucleic acid is selected from the target nucleic acids listed in TABLE 4.

TABLE 4. EXEMPLARY TARGET NUCLEIC ACIDS

Exemplary target nucleic acids
DNMT1, HPRT1, RPL32P3, CCR5, FANCF, GRIN2B, EMX1
AAVS1, ALKBH5, CLTA, CDK11, CTNNB1, AXIN1, LRP6, TBK1, BAP1, TLE3, PPM1A, BCL2L2, SUFU, RICTOR, VPS35, TOP1, SIRT1, PTEN
MMD, PAQR8, H2AX, POU5F1, OCT4
B2M, TRAC, or CIITA, or NGCG_B2M
SYS1, ARFRP1, and TSPAN14
EMC2, EMC3, SEL1L, DERL2, UBE2G2, UBE2J1, and HRD1

In some cases, the target nucleic acid comprises a target locus. In certain embodiments, the target nucleic acid comprises more than one target locus.

In some cases, the target nucleic acid is B2M. In some cases, the B2M target nucleic acid comprises one or more target loci. In some cases, the B2M target nucleic acid comprises two target loci. In some cases, the B2M target locus comprises B2M2 or B2M4.

In some cases, the target nucleic acid is B2M, TRAC, CIITA, NGCG_B2M, or any combination thereof. In some cases, the B2M, TRAC, CIITA, or NGCG_B2M target nucleic acid comprises one or more target loci. In some cases, the B2M, TRAC, CIITA, or NGCG_B2M target nucleic acid comprises two target loci.

A D2S effector protein-guide nucleic acid complex may comprise high selectivity for a target sequence. In some cases, a ribonucleoprotein may comprise a selectivity of at least 200:1, 100:1, 50:1, 20:1, 10:1, or 5:1 for a target nucleic acid over a single nucleotide variant of the target nucleic acid. In some cases, a ribonucleoprotein may comprise a selectivity of at least 5:1 for a target nucleic acid over a single nucleotide variant of the target nucleic acid. Leveraging D2S effector protein selectivity, some methods described herein may detect a target nucleic acid present in the sample in various concentrations or amounts as a target nucleic acid population. In some cases, the sample has at least 2 target nucleic acids. In some cases, the sample has at least 3, 5, 10, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, or 10000 target nucleic acids. In some cases, the sample comprises 1 to 10,000, 100 to 8000, 400 to 6000, 500 to 5000, 1000 to 4000, or 2000 to 3000 target nucleic acids. In some cases, the method detects target nucleic acid present at least at one copy per 10 non-target nucleic acids, 10² non-target nucleic acids, 10³ non-target nucleic acids, 10⁴ non-target nucleic acids, 10⁵ non-target nucleic acids, 10⁶ non-target nucleic acids, 10⁷ non-target nucleic acids, 10⁸ non-target nucleic acids, 10⁹ non-target nucleic acids, or 10¹⁰ non-target nucleic acids.
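By way of illustration only, the selectivity ratios recited above can be sketched as a simple quotient. The sketch assumes, hypothetically, that selectivity is taken as a background-subtracted activity signal measured on the target nucleic acid divided by the same signal measured on its single nucleotide variant; the disclosure does not prescribe this particular measurement, and the function names are illustrative.

```python
# Hypothetical sketch: selectivity as a ratio of two measured signals.
# The choice of signal (e.g., a background-subtracted readout) is an assumption.

# Selectivity thresholds recited in the text, from most to least stringent.
THRESHOLDS = (200, 100, 50, 20, 10, 5)


def selectivity_ratio(target_signal: float, variant_signal: float) -> float:
    """Return signal on the target divided by signal on the single nucleotide variant."""
    if target_signal < 0 or variant_signal <= 0:
        raise ValueError("target_signal must be >= 0 and variant_signal must be > 0")
    return target_signal / variant_signal


def highest_threshold_met(ratio: float):
    """Return the most stringent recited threshold (N in N:1) that the ratio meets, or None."""
    for threshold in THRESHOLDS:
        if ratio >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    ratio = selectivity_ratio(1200.0, 10.0)
    print(ratio, highest_threshold_met(ratio))
```

Under these assumptions, a ratio of 120 would satisfy the "at least 100:1" level but not the "at least 200:1" level.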

Often, the target nucleic acid may be from 0.05% to 20% of total nucleic acids in the sample. Sometimes, the target nucleic acid is 0.1% to 10% of the total nucleic acids in the sample. The target nucleic acid, in some cases, is 0.1% to 5% of the total nucleic acids in the sample. The target nucleic acid may also be 0.1% to 1% of the total nucleic acids in the sample. The target nucleic acid may be DNA or RNA. The target nucleic acid may be any amount less than 100% of the total nucleic acids in the sample. The target nucleic acid may be 100% of the total nucleic acids in the sample.
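As a minimal arithmetic sketch, an abundance expressed as copies of target per number of non-target nucleic acids can be converted to a percentage of total nucleic acids. The sketch assumes that "one copy per 10³ non-target nucleic acids" means one target molecule in addition to 10³ non-target molecules; the function names are illustrative and not part of the disclosure.

```python
# Illustrative conversion between "copies per non-target" and "percent of total".

def percent_of_total(target_copies: float, non_target_copies: float) -> float:
    """Return the target as a percentage of all nucleic acid molecules counted."""
    total = target_copies + non_target_copies
    if target_copies < 0 or non_target_copies < 0 or total == 0:
        raise ValueError("counts must be non-negative and not both zero")
    return 100.0 * target_copies / total


def within_range(percent: float, low: float, high: float) -> bool:
    """Check whether a percentage falls inside an inclusive range such as 0.05% to 20%."""
    return low <= percent <= high


if __name__ == "__main__":
    pct = percent_of_total(1, 10**3)  # one copy per 10^3 non-target molecules
    print(round(pct, 4), within_range(pct, 0.05, 20.0))
```

On this reading, one copy per 10³ non-target nucleic acids is just under 0.1% of the total, which falls inside the 0.05% to 20% range but just below the 0.1% lower bound of the narrower ranges.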

The target nucleic acid may be 0.05% to 20% of total nucleic acids in the sample. Sometimes, the target nucleic acid is 0.1% to 10% of the total nucleic acids in the sample. The target nucleic acid, in some cases, is 0.1% to 5% of the total nucleic acids in the sample. Often, a sample comprises the segment of the target nucleic acid and at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. For example, the segment of the target nucleic acid comprises a mutation as compared to at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid. Often, the segment of the target nucleic acid comprises a single nucleotide mutation as compared to at least one nucleic acid comprising less than 100% sequence identity to the segment of the target nucleic acid but no less than 50% sequence identity to the segment of the target nucleic acid.

A target nucleic acid may be an amplified nucleic acid of interest. The nucleic acid of interest may be any nucleic acid disclosed herein or from any sample as disclosed herein. The nucleic acid of interest may be an RNA that is reverse transcribed before amplification. The nucleic acid of interest may be amplified, and the amplicons may then be transcribed into RNA.

In some instances, compositions described herein exhibit indiscriminate trans-cleavage of ssRNA, enabling their use for detection of RNA in samples. In some cases, target ssRNA are generated from many nucleic acid templates (RNA) in order to achieve cleavage of the FQ reporter in the DETECTR platform. Certain D2S effector proteins may be activated by ssRNA, upon which they may exhibit trans-cleavage of ssRNA and may, thereby, be used to cleave ssRNA FQ reporter molecules in the DETECTR system. These D2S effector proteins may target ssRNA present in the sample or ssRNA generated and/or amplified from any number of nucleic acid templates (RNA). Described herein are reagents comprising a single stranded reporter nucleic acid comprising a detection moiety, wherein the reporter nucleic acid (e.g., the ssDNA-FQ reporter described above) is capable of being cleaved by the D2S effector protein, upon generation and amplification of ssRNA from a nucleic acid template using the methods disclosed herein, thereby generating a first detectable signal.

In some instances, target nucleic acids comprise at least one nucleic acid comprising at least 50% sequence identity to the target nucleic acid or a portion thereof. Sometimes, the at least one nucleic acid comprises a nucleic acid sequence that is at least 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to an equal length portion of the target nucleic acid. Sometimes, the at least one nucleic acid comprises a nucleic acid sequence that is 100% identical to an equal length portion of the target nucleic acid. Sometimes, the nucleic acid sequence of the at least one nucleic acid is at least 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to the target nucleic acid. Sometimes, the target nucleic acid comprises a nucleic acid sequence that is less than 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to an equal length portion of the at least one nucleic acid.
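For illustration, percent identity to an equal length portion can be sketched as a position-by-position comparison of two sequences of the same length. This is a simplified, ungapped calculation offered as an assumption; identity determinations in practice typically rely on a sequence alignment, and the function name and example sequences are hypothetical.

```python
# Simplified, ungapped percent identity between two equal-length sequences.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions at which two equal-length sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    reference = "ACGTACGTAC"  # hypothetical 10-nucleotide portion
    variant = "ACGTTCGTAC"    # differs from the reference at one position
    print(percent_identity(reference, variant))
```

In this hypothetical example a single mismatch over ten positions gives 90% identity, so the variant would meet an "at least 90%" threshold but not an "at least 95%" threshold.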

In some instances, samples comprise a target nucleic acid at a concentration of less than 1 nM, less than 2 nM, less than 3 nM, less than 4 nM, less than 5 nM, less than 6 nM, less than 7 nM, less than 8 nM, less than 9 nM, less than 10 nM, less than 20 nM, less than 30 nM, less than 40 nM, less than 50 nM, less than 60 nM, less than 70 nM, less than 80 nM, less than 90 nM, less than 100 nM, less than 200 nM, less than 300 nM, less than 400 nM, less than 500 nM, less than 600 nM, less than 700 nM, less than 800 nM, less than 900 nM, less than 1 μM, less than 2 μM, less than 3 μM, less than 4 μM, less than 5 μM, less than 6 μM, less than 7 μM, less than 8 μM, less than 9 μM, less than 10 μM, less than 100 μM, or less than 1 mM. In some instances, the sample comprises a target nucleic acid sequence at a concentration of 1 nM to 2 nM, 2 nM to 3 nM, 3 nM to 4 nM, 4 nM to 5 nM, 5 nM to 6 nM, 6 nM to 7 nM, 7 nM to 8 nM, 8 nM to 9 nM, 9 nM to 10 nM, 10 nM to 20 nM, 20 nM to 30 nM, 30 nM to 40 nM, 40 nM to 50 nM, 50 nM to 60 nM, 60 nM to 70 nM, 70 nM to 80 nM, 80 nM to 90 nM, 90 nM to 100 nM, 100 nM to 200 nM, 200 nM to 300 nM, 300 nM to 400 nM, 400 nM to 500 nM, 500 nM to 600 nM, 600 nM to 700 nM, 700 nM to 800 nM, 800 nM to 900 nM, 900 nM to 1 μM, 1 μM to 2 μM, 2 μM to 3 μM, 3 μM to 4 μM, 4 μM to 5 μM, 5 μM to 6 μM, 6 μM to 7 μM, 7 μM to 8 μM, 8 μM to 9 μM, 9 μM to 10 μM, 10 μM to 100 μM, 100 μM to 1 mM, 1 nM to 10 nM, 1 nM to 100 nM, 1 nM to 1 μM, 1 nM to 10 μM, 1 nM to 100 μM, 1 nM to 1 mM, 10 nM to 100 nM, 10 nM to 1 μM, 10 nM to 10 μM, 10 nM to 100 μM, 10 nM to 1 mM, 100 nM to 1 μM, 100 nM to 10 μM, 100 nM to 100 μM, 100 nM to 1 mM, 1 μM to 10 μM, 1 μM to 100 μM, 1 μM to 1 mM, 10 μM to 100 μM, 10 μM to 1 mM, or 100 μM to 1 mM. In some instances, the sample comprises a target nucleic acid at a concentration of 20 nM to 200 μM, 50 nM to 100 μM, 200 nM to 50 μM, 500 nM to 20 μM, or 2 μM to 10 μM. In some instances, the target nucleic acid is not present in the sample.

In some instances, samples comprise fewer than 10 copies, fewer than 100 copies, fewer than 1000 copies, fewer than 10,000 copies, fewer than 100,000 copies, or fewer than 1,000,000 copies of a target nucleic acid sequence. In some instances, the sample comprises 10 copies to 100 copies, 100 copies to 1000 copies, 1000 copies to 10,000 copies, 10,000 copies to 100,000 copies, 100,000 copies to 1,000,000 copies, 10 copies to 1000 copies, 10 copies to 10,000 copies, 10 copies to 100,000 copies, 10 copies to 1,000,000 copies, 100 copies to 10,000 copies, 100 copies to 100,000 copies, 100 copies to 1,000,000 copies, 1,000 copies to 100,000 copies, or 1,000 copies to 1,000,000 copies of a target nucleic acid sequence. In some instances, the sample comprises 10 copies to 500,000 copies, 200 copies to 200,000 copies, 500 copies to 100,000 copies, 1000 copies to 50,000 copies, 2000 copies to 20,000 copies, 3000 copies to 10,000 copies, or 4000 copies to 8000 copies. In some instances, the target nucleic acid is not present in the sample.
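Molar concentrations and copy numbers such as those recited above are related through the sample volume and the Avogadro constant. The following is a minimal sketch of that conversion; the 20 μL volume in the usage example is a hypothetical value chosen for illustration, and the function names are not part of the disclosure.

```python
# Converting between molar concentration and copy number for a given volume.

AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def copies_from_concentration(molar: float, volume_liters: float) -> float:
    """Return the number of molecules in a volume at a given molar concentration."""
    if molar < 0 or volume_liters <= 0:
        raise ValueError("concentration must be >= 0 and volume must be > 0")
    return molar * volume_liters * AVOGADRO


def concentration_from_copies(copies: float, volume_liters: float) -> float:
    """Return the molar concentration of a given number of molecules in a volume."""
    if copies < 0 or volume_liters <= 0:
        raise ValueError("copies must be >= 0 and volume must be > 0")
    return copies / (volume_liters * AVOGADRO)


if __name__ == "__main__":
    # 1 nM target in a hypothetical 20 microliter sample.
    print(f"{copies_from_concentration(1e-9, 20e-6):.3e} copies")
    # 1000 copies in the same hypothetical volume, expressed in mol/L.
    print(f"{concentration_from_copies(1000, 20e-6):.3e} M")
```

The conversion shows that, for microliter-scale volumes, the recited copy-number ranges correspond to concentrations many orders of magnitude below the nanomolar ranges.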

A number of target nucleic acid populations are consistent with the methods and compositions disclosed herein. Some methods described herein may detect two or more target nucleic acid populations present in the sample in various concentrations or amounts. In some cases, the sample has at least 2 target nucleic acid populations. In some cases, the sample has at least 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, or 50 target nucleic acid populations. In some cases, the sample has 3 to 50, 5 to 40, or 10 to 25 target nucleic acid populations. In some cases, the method detects target nucleic acid populations that are present at least at one copy per 10¹ non-target nucleic acids, 10² non-target nucleic acids, 10³ non-target nucleic acids, 10⁴ non-target nucleic acids, 10⁵ non-target nucleic acids, 10⁶ non-target nucleic acids, 10⁷ non-target nucleic acids, 10⁸ non-target nucleic acids, 10⁹ non-target nucleic acids, or 10¹⁰ non-target nucleic acids. The target nucleic acid populations may be present at different concentrations or amounts in the sample.

In some instances, target nucleic acids may activate a D2S effector protein to initiate sequence-independent cleavage of a nucleic acid-based reporter (e.g., a reporter comprising an RNA sequence, or a reporter comprising DNA and RNA). For example, a D2S effector protein of the present disclosure is activated by a target nucleic acid to cleave reporters having an RNA (also referred to herein as an “RNA reporter”). Alternatively, a D2S effector protein of the present disclosure is activated by a target RNA to cleave reporters having an RNA (also referred to herein as an “RNA reporter”). The RNA reporter may comprise a single-stranded RNA labelled with a detection moiety or may be any RNA reporter as disclosed herein.

In some instances, the target nucleic acid as described in the methods herein does not initially comprise a PAM sequence. However, any target nucleic acid of interest may be generated using the methods described herein to comprise a PAM sequence, and thus be a PAM target nucleic acid. A PAM target nucleic acid, as used herein, refers to a target nucleic acid that has been amplified to insert a PAM sequence that is recognized by a D2S effector system.

In some instances, the target nucleic acid is in a cell. In some instances, the cell is a single-cell eukaryotic organism; a plant cell; an algal cell; a fungal cell; an animal cell; a cell of an invertebrate animal; a cell of a vertebrate animal such as a fish, amphibian, reptile, bird, or mammal; or a cell of a mammal such as a human, a non-human primate, an ungulate, a feline, a bovine, an ovine, or a caprine. In preferred embodiments, the cell is a eukaryotic cell. In preferred embodiments, the cell is a mammalian cell, a human cell, or a plant cell.

In some instances, the target nucleic acid comprises a nucleic acid sequence from a pathogen responsible for a disease. Non-limiting examples of pathogens are bacteria, viruses, and fungi. The target nucleic acid, in some cases, is a portion of a nucleic acid from a sexually transmitted infection or a contagious disease. In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, or any DNA amplicon, such as a reverse transcribed mRNA or a cDNA from a gene locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus in at least one of: human immunodeficiency virus (HIV), human papillomavirus (HPV), chlamydia, gonorrhea, syphilis, trichomoniasis, sexually transmitted infection, malaria, Dengue fever, Ebola, chikungunya, and leishmaniasis. Pathogens include viruses, fungi, helminths, protozoa, malarial parasites, Plasmodium parasites, Toxoplasma parasites, and Schistosoma parasites. Helminths include roundworms, heartworms, and phytophagous nematodes, flukes, Acanthocephala, and tapeworms. Protozoan infections include infections from Giardia spp., Trichomonas spp., African trypanosomiasis, amoebic dysentery, babesiosis, balantidial dysentery, Chagas disease, coccidiosis, malaria and toxoplasmosis. Examples of pathogens such as parasitic/protozoan pathogens include, but are not limited to: Plasmodium falciparum, P. vivax, Trypanosoma cruzi and Toxoplasma gondii. Fungal pathogens include, but are not limited to Cryptococcus neoformans, Histoplasma capsulatum, Coccidioides immitis, Blastomyces dermatitidis, Chlamydia trachomatis, and Candida albicans. Pathogenic viruses include but are not limited to coronavirus (e.g., SARS-CoV-2); immunodeficiency virus (e.g., HIV); influenza virus; dengue; West Nile virus; herpes virus; yellow fever virus; Hepatitis Virus C; Hepatitis Virus A; Hepatitis Virus B; papillomavirus; and the like.
Pathogens include, e.g., HIV virus, Mycobacterium tuberculosis, Streptococcus agalactiae, methicillin-resistant Staphylococcus aureus, Legionella pneumophila, Streptococcus pyogenes, Escherichia coli, Neisseria gonorrhoeae, Neisseria meningitidis, Pneumococcus, Cryptococcus neoformans, Histoplasma capsulatum, Hemophilus influenzae B, Treponema pallidum, Lyme disease spirochetes, Pseudomonas aeruginosa, Mycobacterium leprae, Brucella abortus, rabies virus, influenza virus, cytomegalovirus, herpes simplex virus I, herpes simplex virus II, human serum parvo-like virus, respiratory syncytial virus (RSV), M. genitalium, T. vaginalis, varicella-zoster virus, hepatitis B virus, hepatitis C virus, measles virus, adenovirus, human T-cell leukemia viruses, Epstein-Barr virus, murine leukemia virus, mumps virus, vesicular stomatitis virus, Sindbis virus, lymphocytic choriomeningitis virus, wart virus, blue tongue virus, Sendai virus, feline leukemia virus, Reovirus, polio virus, simian virus 40, mouse mammary tumor virus, dengue virus, rubella virus, West Nile virus, Plasmodium falciparum, Plasmodium vivax, Toxoplasma gondii, Trypanosoma rangeli, Trypanosoma cruzi, Trypanosoma rhodesiense, Trypanosoma brucei, Schistosoma mansoni, Schistosoma japonicum, Babesia bovis, Eimeria tenella, Onchocerca volvulus, Leishmania tropica, Mycobacterium tuberculosis, Trichinella spiralis, Theileria parva, Taenia hydatigena, Taenia ovis, Taenia saginata, Echinococcus granulosus, Mesocestoides corti, Mycoplasma arthritidis, M. hyorhinis, M. orale, M. arginini, Acholeplasma laidlawii, M. salivarium and M. pneumoniae. In some cases, the target sequence is a portion of a nucleic acid from a genomic locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus of a bacterium or other agents responsible for a disease in the sample comprising a mutation that confers resistance to a treatment, such as a single nucleotide mutation that confers resistance to antibiotic treatment.

In some embodiments, compositions, systems, and methods described herein comprise a modified target nucleic acid, which can describe a target nucleic acid that has undergone a modification, for example, after contact with an effector protein. In some cases, the modification is an alteration in the sequence of the target nucleic acid. In some cases, the modified target nucleic acid comprises an insertion, deletion, or replacement of one or more nucleotides compared to the unmodified target nucleic acid.

In some instances, the target nucleic acid sequence comprises a nucleic acid sequence of a virus, a bacterium, or other pathogen responsible for a disease in a plant (e.g., a crop). Methods and compositions of the disclosure may be used to treat or detect a disease in a plant. For example, the methods of the disclosure may be used to target a viral nucleic acid sequence in a plant. A D2S effector protein of the disclosure (e.g., Cas14) may cleave the viral nucleic acid. In some instances, the target nucleic acid sequence comprises a nucleic acid sequence of a virus or a bacterium or other agents (e.g., any pathogen) responsible for a disease in the plant (e.g., a crop). In some instances, the target nucleic acid comprises RNA. The target nucleic acid, in some cases, is a portion of a nucleic acid from a virus or a bacterium or other agents responsible for a disease in the plant (e.g., a crop). In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, or any NA amplicon, such as a reverse transcribed mRNA or a cDNA from a gene locus, a transcribed mRNA, or a reverse transcribed cDNA from a gene locus in a virus or a bacterium or other agents (e.g., any pathogen) responsible for a disease in the plant (e.g., a crop). A virus infecting the plant may be an RNA virus. A virus infecting the plant may be a DNA virus. Non-limiting examples of viruses that may be targeted with the disclosure include Tobacco mosaic virus (TMV), Tomato spotted wilt virus (TSWV), Cucumber mosaic virus (CMV), Potato virus Y (PVY), Cauliflower mosaic virus (CaMV) (RT virus), Plum pox virus (PPV), Brome mosaic virus (BMV) and Potato virus X (PVX).

Mutations

In some instances, target nucleic acids comprise a mutation. In some embodiments, a composition, system or method described herein can be used to modify a target nucleic acid comprising a mutation such that the mutation is modified to be a wild-type nucleotide or nucleotide sequence. In some embodiments, a composition, system or method described herein can be used to detect a target nucleic acid comprising a mutation. In some instances, a sequence comprising a mutation may be modified to a wildtype sequence with a composition, system or method described herein. In some instances, a sequence comprising a mutation may be detected with a composition, system or method described herein. The mutation may be a mutation of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. The mutation may comprise a deletion of about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, or about 1000 nucleotides. The mutation may comprise a deletion of 1 to 5, 5 to 10, 10 to 15, 15 to 20, 20 to 25, 25 to 30, 30 to 35, 35 to 40, 40 to 45, 45 to 50, 50 to 55, 55 to 60, 60 to 65, 65 to 70, 70 to 75, 75 to 80, 80 to 85, 85 to 90, 90 to 95, 95 to 100, 100 to 200, 200 to 300, 300 to 400, 400 to 500, 500 to 600, 600 to 700, 700 to 800, 800 to 900, 900 to 1000, 1 to 50, 1 to 100, 25 to 50, 25 to 100, 50 to 100, 100 to 500, 100 to 1000, or 500 to 1000 nucleotides. Non-limiting examples of mutations are insertion-deletion (indel), single nucleotide polymorphism (SNP), and frameshift mutations. In some instances, guide nucleic acids described herein hybridize to a region of the target nucleic acid comprising the mutation. The mutation may be located in a non-coding region or a coding region of a gene.
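The mutation types named above can be illustrated with a coarse classifier that compares a reference allele with an altered allele. The sketch relies on the general principle that an insertion or deletion within a coding region shifts the reading frame when its length is not a multiple of three; the function name, labels, and example alleles are hypothetical and are offered only as an illustration.

```python
# Coarse, illustrative classification of a mutation from reference and altered alleles.

def classify_mutation(ref: str, alt: str, in_coding_region: bool = True) -> str:
    """Label a mutation as a substitution, insertion, or deletion, noting frameshifts."""
    if ref == alt:
        return "no change"
    if len(ref) == len(alt):
        # Same length: one or more nucleotides replaced.
        return "single nucleotide substitution" if len(ref) == 1 else "substitution"
    delta = len(alt) - len(ref)
    kind = "insertion" if delta > 0 else "deletion"
    if not in_coding_region:
        return kind
    # In a coding region, a length change not divisible by 3 shifts the reading frame.
    return f"frameshift {kind}" if abs(delta) % 3 != 0 else f"in-frame {kind}"


if __name__ == "__main__":
    print(classify_mutation("A", "G"))                            # single nucleotide substitution
    print(classify_mutation("ACGT", "A"))                         # 3-nucleotide deletion
    print(classify_mutation("AC", "A"))                           # 1-nucleotide deletion
    print(classify_mutation("A", "ACG", in_coding_region=False))  # non-coding insertion
```

A deletion of three nucleotides removes one codon without disturbing the downstream reading frame, whereas a deletion of one nucleotide alters every downstream codon.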

A mutation may be in an open reading frame of a target nucleic acid. A mutation may result in the insertion of at least one amino acid in a protein encoded by the target nucleic acid. A mutation may result in the deletion of at least one amino acid in a protein encoded by the target nucleic acid. A mutation may result in the substitution of at least one amino acid in a protein encoded by the target nucleic acid. A mutation that results in the deletion, insertion, or substitution of one or more amino acids of a protein encoded by the target nucleic acid may result in misfolding of a protein encoded by the target nucleic acid. A mutation may result in a premature stop codon, thereby resulting in a truncation of the encoded protein.

In some embodiments, a mutation comprises a point mutation or single nucleotide polymorphism (SNP), a chromosomal mutation, a copy number mutation, or any combination thereof. A point mutation optionally comprises a substitution, insertion, or deletion. In some embodiments, a mutation comprises a chromosomal mutation. A chromosomal mutation can comprise an inversion, a deletion, a duplication, or a translocation of one or more nucleotides. In some embodiments, a mutation comprises a copy number variation. A copy number variation can comprise a gene amplification or an expanding trinucleotide repeat. In some embodiments, guide nucleic acids described herein hybridize to a target sequence of a target nucleic acid comprising the mutation. In some embodiments, mutations are located in a non-coding region of a gene.

In some instances, target nucleic acids comprise a mutation, wherein the mutation is a SNP. The single nucleotide mutation or SNP may be associated with a phenotype of the sample or a phenotype of the organism from which the sample was taken. The SNP, in some cases, is associated with a phenotype altered from the wild-type phenotype. In some embodiments, a single nucleotide mutation, SNP, or deletion described herein is associated with a disease, such as a genetic disease. The SNP may be a synonymous substitution or a nonsynonymous substitution. The nonsynonymous substitution may be a missense substitution or a nonsense point mutation. The synonymous substitution may be a silent substitution. The mutation may be a deletion of one or more nucleotides. Often, the single nucleotide mutation, SNP, or deletion is associated with a disease such as cancer or a genetic disorder. The mutation, such as a single nucleotide mutation, a SNP, or a deletion, may be encoded in the sequence of a target nucleic acid from the germline of an organism or may be encoded in a target nucleic acid from a diseased cell, such as a cancer cell.
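The distinction between synonymous, missense, and nonsense substitutions can be illustrated by translating the affected codon before and after a single nucleotide change. The following is a minimal sketch that assumes the standard genetic code and DNA codons; the function name and example codons are illustrative, and the "stop-loss" label covers a case the text above does not name.

```python
# Illustrative classification of a single nucleotide substitution within one codon,
# using the standard genetic code (DNA alphabet, '*' marks a stop codon).
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change that differs at exactly one nucleotide position."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three DNA nucleotides (A, C, G, T)")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # silent: encoded amino acid unchanged
    if alt_aa == "*":
        return "nonsense"    # introduces a premature stop codon
    if ref_aa == "*":
        return "stop-loss"   # not named in the text; included for completeness
    return "missense"        # encodes a different amino acid


if __name__ == "__main__":
    print(classify_substitution("GAA", "GAG"))  # Glu -> Glu
    print(classify_substitution("GAG", "GTG"))  # Glu -> Val
    print(classify_substitution("TGG", "TGA"))  # Trp -> stop
```

A nonsense substitution of this kind corresponds to a premature stop codon, which truncates the encoded protein.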

In some embodiments, the target nucleic acid comprises a mutation associated with a disease. In some examples, a mutation associated with a disease refers to a mutation whose presence in a subject indicates that the subject is susceptible to, or suffers from, a disease, disorder, condition, or syndrome. In some examples, a mutation associated with a disease refers to a mutation which causes, contributes to the development of, or indicates the existence of the disease, disorder, condition, or syndrome. A mutation associated with a disease may also refer to any mutation which generates transcription or translation products at an abnormal level, or in an abnormal form, in cells affected by a disease relative to a control without the disease. In some examples, a mutation associated with a disease refers to a mutation whose presence in a subject indicates that the subject is susceptible to, or suffers from, a disease, disorder, or pathological state. In some embodiments, a mutation associated with a disease comprises the co-occurrence of a mutation and the phenotype of a disease. The mutation may occur in a gene, wherein transcription or translation products from the gene occur at a significantly abnormal level or in an abnormal form in a cell or subject harboring the mutation as compared to a non-disease control subject not having the mutation.

In some instances, target nucleic acids comprise a mutation, wherein the mutation is a deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. In some embodiments, a target nucleic acid comprises a mutation of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. The mutation may be a deletion of about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, or about 1000 nucleotides. The mutation may be a deletion of 1 to 5, 5 to 10, 10 to 15, 15 to 20, 20 to 25, 25 to 30, 30 to 35, 35 to 40, 40 to 45, 45 to 50, 50 to 55, 55 to 60, 60 to 65, 65 to 70, 70 to 75, 75 to 80, 80 to 85, 85 to 90, 90 to 95, 95 to 100, 100 to 200, 200 to 300, 300 to 400, 400 to 500, 500 to 600, 600 to 700, 700 to 800, 800 to 900, 900 to 1000, 1 to 50, 1 to 100, 25 to 50, 25 to 100, 50 to 100, 100 to 500, 100 to 1000, or 500 to 1000 nucleotides.

Certain Samples

Various sample types comprising a target nucleic acid of interest are consistent with the present disclosure. These samples may comprise a target nucleic acid sequence for detection. In some instances, the detection of the target nucleic acid indicates an ailment, such as a disease, cancer, or genetic disorder, or provides genetic information, such as for phenotyping, genotyping, or determining ancestry. Such samples are compatible with the reagents and support mediums as described herein. Generally, a sample from an individual or an animal or an environmental sample may be obtained to test for presence of a disease, cancer, genetic disorder, or any mutation of interest.

In some instances, the sample is a biological sample, an environmental sample, or a combination thereof. Non-limiting examples of biological samples are blood, serum, plasma, saliva, urine, mucosal sample, peritoneal sample, cerebrospinal fluid, gastric secretions, nasal secretions, sputum, pharyngeal exudates, urethral or vaginal secretions, an exudate, an effusion, and a tissue sample (e.g., a biopsy sample). A tissue sample from a subject may be dissociated or liquified prior to application to a detection system of the present disclosure. Non-limiting examples of environmental samples are soil, air, or water. In some instances, an environmental sample is taken as a swab from a surface of interest or taken directly from the surface of interest.

In some instances, the sample is a raw (unprocessed, unmodified) sample. Raw samples may be applied to a system for detecting or modifying a target nucleic acid, such as those described herein. In some instances, the sample is diluted with a buffer or a fluid or concentrated prior to its application to the system, or is applied neat to the detection system. Sometimes, the sample contains no more than 20 μL of buffer or fluid. The sample, in some cases, is contained in no more than 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 90, 100, 200, 300, 400, or 500 μL, or any value from 1 μL to 500 μL, preferably 10 μL to 200 μL, or more preferably 50 μL to 100 μL of buffer or fluid. Sometimes, the sample is contained in more than 500 μL.

In some instances, the sample is taken from a single-cell eukaryotic organism; a plant or a plant cell; an algal cell; a fungal cell; an animal cell, tissue, or organ; a cell, tissue, or organ from an invertebrate animal; a cell, tissue, fluid, or organ from a vertebrate animal such as fish, amphibian, reptile, bird, and mammal; a cell, tissue, fluid, or organ from a mammal such as a human, a non-human primate, an ungulate, a feline, a bovine, an ovine, and a caprine. In some instances, the sample is taken from nematodes, protozoans, helminths, or malarial parasites. In some cases, the sample comprises nucleic acids from a cell lysate from a eukaryotic cell, a mammalian cell, a human cell, a prokaryotic cell, or a plant cell. In some cases, the sample comprises nucleic acids expressed from a cell.

In some instances, samples are used for diagnosing a disease. In some instances, the disease is cancer. The sample used for cancer testing may comprise at least one target nucleic acid that may bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, comprises a portion of a gene comprising a mutation associated with cancer, a gene whose overexpression is associated with cancer, a tumor suppressor gene, an oncogene, a checkpoint inhibitor gene, a gene associated with cellular growth, a gene associated with cellular metabolism, or a gene associated with cell cycle. Sometimes, the target nucleic acid encodes a cancer biomarker, such as a prostate cancer biomarker or a non-small cell lung cancer biomarker. In some cases, the assay may be used to detect “hotspots” in target nucleic acids that may be predictive of lung cancer. In some cases, the target nucleic acid comprises a portion of a nucleic acid that is associated with a blood fever. In some cases, the target nucleic acid is a portion of a nucleic acid from a genomic locus, any DNA amplicon of, a reverse transcribed mRNA, or a cDNA from a locus of at least one of: ALK, APC, ATM, AXIN2, BAP1, BARD1, BLM, BMPR1A, BRCA1, BRCA2, BRIP1, CASR, CDC73, CDH1, CDK4, CDKN1B, CDKN1C, CDKN2A, CEBPA, CHEK2, CTNNA1, DICER1, DIS3L2, EGFR, EPCAM, FH, FLCN, GATA2, GPC3, GREM1, HOXB13, HRAS, system, MAX, MEN1, MET, MITF, MLH1, MSH2, MSH3, MSH6, MUTYH, NBN, NF1, NF2, NTHL1, PALB2, PDGFRA, PHOX2B, PMS2, POLD1, POLE, POT1, PRKAR1A, PTCH1, PTEN, RAD50, RAD51C, RAD51D, RB1, RECQL4, RET, RUNX1, SDHA, SDHAF2, SDHB, SDHC, SDHD, SMAD4, SMARCA4, SMARCB1, SMARCE1, STK11, SUFU, TERC, TERT, TMEM127, TP53, TSC1, TSC2, VHL, WRN, and WT1. Any region of the aforementioned gene loci may be probed for a mutation or deletion using the compositions and methods disclosed herein. For example, in the EGFR gene locus, the compositions and methods for detection disclosed herein may be used to detect a single nucleotide polymorphism or a deletion.

In some instances, samples are used to diagnose a genetic disorder, also referred to as genetic disorder testing. The sample used for genetic disorder testing may comprise at least one target nucleic acid that may bind to a guide nucleic acid of the reagents described herein. In some instances, the genetic disorder is hemophilia, sickle cell anemia, β-thalassemia, Duchenne muscular dystrophy, severe combined immunodeficiency, Huntington's disease, or cystic fibrosis. The target nucleic acid, in some cases, is from a gene with a mutation associated with a genetic disorder, from a gene whose overexpression is associated with a genetic disorder, from a gene associated with abnormal cellular growth resulting in a genetic disorder, or from a gene associated with abnormal cellular metabolism resulting in a genetic disorder. In some cases, the target nucleic acid is a nucleic acid from a genomic locus, a transcribed mRNA, or a reverse transcribed mRNA, a DNA amplicon of, or a cDNA from, a locus of at least one of: CFTR, FMR1, SMN1, ABCB11, ABCC8, ABCD1, ACAD9, ACADM, ACADVL, ACAT1, ACOX1, ACSF3, ADA, ADAMTS2, ADGRG1, AGA, AGL, AGPS, AGXT, AIRE, ALDH3A2, ALDOB, ALG6, ALMS1, ALPL, AMT, AQP2, ARG1, ARSA, ARSB, ASL, ASNS, ASPA, ASS1, ATM, ATP6V1B1, ATP7A, ATP7B, ATRX, BBS1, BBS10, BBS12, BBS2, BCKDHA, BCKDHB, BCS1L, BLM, BSND, CAPN3, CBS, CDH23, CEP290, CERKL, CHM, CHRNE, CIITA, CLN3, CLN5, CLN6, CLN8, CLRN1, CNGB3, COL27A1, COL4A3, COL4A4, COL4A5, COL7A1, CPS1, CPT1A, CPT2, CRB1, CTNS, CTSK, CYBA, CYBB, CYP11B1, CYP11B2, CYP17A1, CYP19A1, CYP27A1, DBT, DCLRE1C, DHCR7, DHDDS, DLD, DMD, DNAH5, DNAI1, DNAI2, DYSF, EDA, EIF2B5, EMD, ERCC6, ERCC8, ESCO2, ETFA, ETFDH, ETHE1, EVC, EVC2, EYS, F9, FAH, FAM161A, FANCA, FANCC, FANCG, FH, FKRP, FKTN, G6PC, GAA, GALC, GALK1, GALT, GAMT, GBA, GBE1, GCDH, GFM1, GJB1, GJB2, GLA, GLB1, GLDC, GLE1, GNE, GNPTAB, GNPTG, GNS, GRHPR, HADHA, HAX1, HBA1, HBA2, HBB, HEXA, HEXB, HGSNAT, HLCS, HMGCL, HOGA1, HPS1, HPS3, HSD17B4, HSD3B2, HYAL1, HYLS1, IDS, IDUA, IKBKAP, IL2RG, WD,
KCNJ11, LAMA2, LAMA3, LAMB3, LAMC2, LCA5, LDLR, LDLRAP1, LHX3, LIFR, LIPA, LOXHD1, LPL, LRPPRC, MAN2B1, MCOLN1, MED17, MESP2, MFSD8, MKS1, MLC1, MMAA, MMAB, MMACHC, MMADHC, MPI, MPL, MPV17, MTHFR, MTM1, MTRR, MTTP, MUT, MYO7A, NAGLU, NAGS, NBN, NDRG1, NDUFAF5, NDUFS6, NEB, NPC1, NPC2, NPHS1, NPHS2, NR2E3, NTRK1, OAT, OPA3, OTC, PAH, PC, PCCA, PCCB, PCDH15, PDHA1, PDHB, PEX1, PEX10, PEX12, PEX2, PEX6, PEX7, PFKM, PHGDH, PKHD1, PMM2, POMGNT1, PPT1, PROP1, PRPS1, PSAP, PTS, PUS1, PYGM, RAB23, RAG2, RAPSN, RARS2, RDH12, RMRP, RPE65, RPGRIP1L, RS1, RTEL1, SACS, SAMHD1, SEPSECS, SGCA, SGCB, SGCG, SGSH, SLC12A3, SLC12A6, SLC17A5, SLC22A5, SLC25A13, SLC25A15, SLC26A2, SLC26A4, SLC35A3, SLC37A4, SLC39A4, SLC4A11, SLC6A8, SLC7A7, SMARCAL1, SMPD1, STAR, SUMF1, TAT, TCIRG1, TECPR2, TFR2, TGM1, TH, TMEM216, TPP1, TRMU, TSFM, TTPA, TYMP, USH1C, USH2A, VPS13A, VPS13B, VPS45, VRK1, VSX2, WNT10A, XPA, XPC, and ZFYVE26.

The sample used for phenotyping testing may comprise at least one target nucleic acid that may bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a phenotypic trait.

The sample used for genotyping testing may comprise at least one target nucleic acid that may bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a genotype of interest.

The sample used for ancestral testing may comprise at least one target nucleic acid that may bind to a guide nucleic acid of the reagents described herein. The target nucleic acid, in some cases, is a nucleic acid encoding a sequence associated with a geographic region of origin or ethnic group.

The sample may be used for identifying a disease status. For example, a sample is any sample described herein, and is obtained from a subject for use in identifying a disease status of a subject. The disease may be a cancer or genetic disorder. Sometimes, a method comprises obtaining a serum sample from a subject; and identifying a disease status of the subject. Often, the disease status is prostate disease status, but the status of any disease may be assessed.

Any of the above disclosed samples are consistent with the methods, compositions, reagents, enzymes, and systems disclosed herein.

EXEMPLARY EMBODIMENTS

1. A composition comprising an effector protein, or a nucleic acid encoding the effector protein, and a guide nucleic acid, or a nucleic acid encoding the guide nucleic acid, wherein the effector protein comprises an amino acid sequence that is (a) at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23 and (b) includes six amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796,

(v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797,

(vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and

(vii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799,

and wherein the effector protein interacts with the guide nucleic acid to form a complex that is targeted to a target sequence via base pairing between the guide nucleic acid and the target sequence.

2. The composition of embodiment 1, wherein the effector protein comprises seven amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796,

(v) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797,

(vi) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, and

(vii) an amino acid sequence that is at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799.

3. The composition of embodiment 1 or embodiment 2, wherein the effector protein comprises six amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 796,

(v) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 797,

(vi) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 798, and

(vii) an amino acid sequence that is at least 69.5% identical to SEQ ID NO: 799.

4. The composition of any preceding embodiment, wherein the effector protein comprises six amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 80% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 80% identical to SEQ ID NO: 796,

(v) an amino acid sequence that is at least 80% identical to SEQ ID NO: 797,

(vi) an amino acid sequence that is at least 80% identical to SEQ ID NO: 798, and

(vii) an amino acid sequence that is at least 80% identical to SEQ ID NO: 799.

5. The composition of any one of the preceding embodiments, wherein the effector protein comprises an amino acid sequence that is at least 68% identical to SEQ ID NO: 23.

6. A composition comprising an effector protein, or a nucleic acid encoding the effector protein, and a guide nucleic acid, or a nucleic acid encoding the guide nucleic acid, wherein the effector protein comprises a sequence of amino acids that is at least 37%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 796, and wherein the effector protein interacts with the guide nucleic acid to form a complex that is targeted to a target sequence via base pairing between the guide nucleic acid and the target sequence.

7. The composition of embodiment 6, wherein the effector protein further comprises four amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 797,

(v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 798, and

(vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 799.

8. The composition of embodiment 6, wherein the effector protein further comprises five amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 797,

(v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 798, and

(vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 799.

9. The composition of embodiment 6, wherein the effector protein further comprises six amino acid sequences selected from the group:

(i) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 793, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 793,

(ii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 794, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 794,

(iii) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 795, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 795,

(iv) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 797, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 797,

(v) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 798, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 798, and

(vi) an amino acid sequence that is at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 799, preferably wherein the sequence is at least 69.5% identical to SEQ ID NO: 799.

10. The composition of any one of embodiments 6 to 9, wherein the effector protein comprises an amino acid sequence that is at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 23.

11. The composition of any one of the preceding embodiments, wherein the amino acid sequences having at least the threshold identity with any one of SEQ ID NO: 793 to SEQ ID NO: 799 are in the following order starting from the N terminus:

(i) the sequence having at least the threshold identity with SEQ ID NO: 796,

(ii) the sequence having at least the threshold identity with SEQ ID NO: 797,

(iii) the sequence having at least the threshold identity with SEQ ID NO: 795,

(iv) the sequence having at least the threshold identity with SEQ ID NO: 799,

(v) the sequence having at least the threshold identity with SEQ ID NO: 794,

(vi) the sequence having at least the threshold identity with SEQ ID NO: 793, and

(vii) the sequence having at least the threshold identity with SEQ ID NO: 798.

12. The composition of any one of the preceding embodiments, wherein the effector protein comprises an amino acid sequence that is identical to SEQ ID NO: 23.

13. The composition of any one of the preceding embodiments, wherein the guide nucleic acid is an engineered guide nucleic acid.

14. The composition of any one of the preceding embodiments, wherein the guide nucleic acid comprises a repeat region that is at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% identical to any one of SEQ ID NOs: 630, 641, and 827-929.

15. The composition of any one of the preceding embodiments, wherein the guide nucleic acid comprises a crRNA and a tracrRNA, optionally wherein the guide nucleic acid is a single guide nucleic acid.

16. The composition of any one of the preceding embodiments, wherein the effector protein is about 380 to about 850 amino acids in length.

17. The composition of embodiment 16, wherein the effector protein is about 400 to about 550 amino acids in length.

18. The composition of any one of the preceding embodiments, wherein the effector protein is fused to a fusion partner.

19. The composition of embodiment 18, wherein the effector protein is fused to the fusion partner via a linker protein.

20. The composition of embodiment 18 or embodiment 19, wherein the effector protein is fused to a fusion partner at the N-terminus and/or the C-terminus.

21. The composition of any one of embodiments 18-20, wherein the fusion partner:

(a) modulates transcription;

(b) has an enzymatic activity that modifies the target nucleic acid;

(c) has an enzymatic activity that modifies a protein associated with the target nucleic acid;

(d) modifies a nucleobase of the target nucleic acid, optionally wherein the fusion partner is a deaminase;

(e) comprises a chloroplast transit peptide;

(f) comprises an endosomal escape peptide; and/or

(g) comprises a nuclear localisation signal.

22. The composition of any one of the preceding embodiments, wherein the effector protein is modified to reduce the nucleic acid-cleaving activity of the effector protein.

23. The composition of embodiment 22, wherein the effector protein is enzymatically inactive.

24. The composition of any one of the preceding embodiments, wherein the composition further comprises a donor nucleic acid.

25. A method of detecting a target nucleic acid in a sample, comprising:

(a) contacting the sample with:

(i) the composition of any one of embodiments 1-23; and

(ii) a reporter nucleic acid, wherein a detectable signal is produced when the reporter nucleic acid is cleaved by the effector protein; and

(b) detecting the detectable signal.

26. A method of modifying a target nucleic acid, the method comprising contacting the target nucleic acid with the composition of any one of embodiments 1-24.

27. The method of embodiment 26, wherein modifying the target nucleic acid comprises cleaving the target nucleic acid, deleting a nucleotide of the target nucleic acid, inserting a nucleotide into the target nucleic acid, substituting a nucleotide of the target nucleic acid with a donor nucleotide or an additional nucleotide, or any combination thereof.

28. The method of embodiment 26 or embodiment 27, wherein the contacting occurs in vitro, in vivo or ex vivo.

29. The method of embodiment 28, wherein the contacting comprises introducing the composition of any one of embodiments 1-24 into a cell, optionally wherein the cell is a eukaryotic cell.

30. A cell modified by the method of embodiment 29.

31. The composition of any one of embodiments 1-24 for use in therapy.

32. A method of treating a patient comprising administering the composition of any one of embodiments 1-24.

EXAMPLES

The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

Example 1: PAM Screening for D2S Effector Proteins

D2S effector proteins and guide RNA combinations represented in TABLE 5 were screened by in vitro enrichment (IVE) for PAM recognition. TABLE 5 shows the components of each effector protein-guide RNA complex assayed for PAM recognition. The amino acid sequences of the effector protein names in the second column of the table are shown in TABLE 1 herein. The nucleobase sequences of the guide components in the third through sixth columns of the table are shown in TABLE 2 and TABLE 3 herein. For example, as shown in TABLE 2, an effector protein comprising an amino acid sequence of SEQ ID NO: 1 complexed with a guide comprising a crRNA of SEQ ID NO: 46 and a tracrRNA of SEQ ID NO: 91 was screened for PAM recognition. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C. and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Next generation sequencing was performed on cut sequences to identify enriched PAMs. As shown in TABLE 5, cis cleavages were observed with RNP complexes comprising D2S effector proteins and corresponding guide RNAs.

TABLE 5 Observed Cis Cleavage for Effector Protein/Guide Combinations

| Comp. No: | Effector Protein | cis-cleavage (y/n) | crRNA # | tracrRNA # | sgRNA # |
|---|---|---|---|---|---|
| 1 | CasM.298706 (SEQ ID NO: 1) | Y | R4879 (SEQ ID NO: 46) | R4935 (SEQ ID NO: 91) | — |
| 4 | CasM.284933 (SEQ ID NO: 4) | Y | R4841 (SEQ ID NO: 49) | R4902 (SEQ ID NO: 94) | — |
| 13 | CasM.297894 (SEQ ID NO: 13) | Y | R4987 (SEQ ID NO: 58) | R4904 (SEQ ID NO: 103) | — |
| 14 | CasM.291449 (SEQ ID NO: 14) | N | R4875 (SEQ ID NO: 59) | R4939 (SEQ ID NO: 104) | — |
| 15 | CasM.291449 (SEQ ID NO: 14) | N | R4875 (SEQ ID NO: 59) | R4938 (SEQ ID NO: 105) | — |
| 16 | CasM.297599 (SEQ ID NO: 15) | Y | R4876 (SEQ ID NO: 60) | R4892 (SEQ ID NO: 106) | — |
| 17 | CasM.297599 (SEQ ID NO: 15) | Y | R4876 (SEQ ID NO: 60) | R4942 (SEQ ID NO: 107) | — |
| 23 | CasM.292335 (SEQ ID NO: 18) | Y | R4851 (SEQ ID NO: 63) | R4907 (SEQ ID NO: 113) | — |
| 24 | CasM.293576 (SEQ ID NO: 19) | Y | R4852 (SEQ ID NO: 64) | R4896 (SEQ ID NO: 114) | — |
| 28 | CasM.298538 (SEQ ID NO: 21) | Y | R4854 (SEQ ID NO: 66) | R4897 (SEQ ID NO: 118) | — |
| 30 | CasM.19924 (SEQ ID NO: 22) | Y | R4855 (SEQ ID NO: 67) | R4893 (SEQ ID NO: 120) | — |
| 31 | CasM.19924 (SEQ ID NO: 22) | Y | — | — | R4886 (SEQ ID NO: 149) |
| 32 | CasM.19952 (SEQ ID NO: 23) | Y | R4856 (SEQ ID NO: 68) | R4893 (SEQ ID NO: 120) | — |
| 33 | CasM.19952 (SEQ ID NO: 23) | Y | — | — | R4886 (SEQ ID NO: 149) |
| 34 | CasM.274559 (SEQ ID NO: 24) | Y | R4857 (SEQ ID NO: 69) | R4894 (SEQ ID NO: 121) | — |
| 35 | CasM.274559 (SEQ ID NO: 24) | Y | — | — | R4887 (SEQ ID NO: 150) |
| 36 | CasM.286251 (SEQ ID NO: 25) | Y | R4858 (SEQ ID NO: 70) | R4910 (SEQ ID NO: 122) | — |
| 37 | CasM.286251 (SEQ ID NO: 25) | Y | — | — | R4882 (SEQ ID NO: 151) |
| 39 | CasM.288480 (SEQ ID NO: 26) | Y | — | — | R4886 (SEQ ID NO: 149) |
| 41 | CasM.289206 (SEQ ID NO: 28) | Y | R4861 (SEQ ID NO: 73) | R4894 (SEQ ID NO: 121) | — |
| 42 | CasM.289206 (SEQ ID NO: 28) | Y | — | — | R4887 (SEQ ID NO: 150) |
| 43 | CasM.290598 (SEQ ID NO: 29) | Y | R4862 (SEQ ID NO: 74) | R4894 (SEQ ID NO: 121) | — |
| 45 | CasM.290816 (SEQ ID NO: 30) | Y | R4863 (SEQ ID NO: 75) | R4912 (SEQ ID NO: 124) | — |
| 48 | CasM.295071 (SEQ ID NO: 31) | Y | — | — | R4882 (SEQ ID NO: 151) |
| 50 | CasM.295231 (SEQ ID NO: 32) | Y | — | — | R4884 (SEQ ID NO: 152) |
| 54 | CasM.279423 (SEQ ID NO: 34) | Y | R4857 (SEQ ID NO: 79) | R4894 (SEQ ID NO: 127) | — |
| 71 | CasM.295105 (SEQ ID NO: 43) | Y | R4872 (SEQ ID NO: 88) | R4925 (SEQ ID NO: 144) | — |
| 72 | CasM.295187 (SEQ ID NO: 44) | Y | R4873 (SEQ ID NO: 89) | R4945 (SEQ ID NO: 145) | — |
| 74 | CasM.295929 (SEQ ID NO: 45) | Y | R4874 (SEQ ID NO: 90) | R4928 (SEQ ID NO: 147) | — |
| 75 | CasM.295929 (SEQ ID NO: 45) | Y | R4874 (SEQ ID NO: 90) | R4927 (SEQ ID NO: 148) | — |

TABLE 6 Exemplary PAM Sequences

| Comp. No | Effector Protein Name | Amino Acid SEQ ID NO: | PAM Sequence |
|---|---|---|---|
| 1 | CasM.298706 | 1 | CTT (SEQ ID NO: 154) |
| 13 | CasM.297894 | 13 | CTT (SEQ ID NO: 154) |
| 16 | CasM.297599 | 15 | CC (SEQ ID NO: 155) |
| 17 | CasM.297599 | 15 | CC (SEQ ID NO: 155) |
| 23 | CasM.292335 | 18 | CC (SEQ ID NO: 155) |
| 24 | CasM.293576 | 19 | CC (SEQ ID NO: 155) |
| 28 | CasM.298538 | 21 | TC (SEQ ID NO: 164) |
| 30 | CasM.19924 | 22 | TCG (SEQ ID NO: 156) |
| 31 | CasM.19924 | 22 | GCG (SEQ ID NO: 157) |
| 32 | CasM.19952 | 23 | TCG (SEQ ID NO: 156), TTG (SEQ ID NO: 158), GCG (SEQ ID NO: 157), GTG (SEQ ID NO: 159) |
| 33 | CasM.19952 | 23 | TCG (SEQ ID NO: 156), TTG (SEQ ID NO: 158), GCG (SEQ ID NO: 157), GTG (SEQ ID NO: 159) |
| 34 | CasM.274559 | 24 | TCG (SEQ ID NO: 156) |
| 35 | CasM.274559 | 24 | TCG (SEQ ID NO: 156) |
| 36 | CasM.286251 | 25 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 37 | CasM.286251 | 25 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 39 | CasM.288480 | 26 | TCG (SEQ ID NO: 156) |
| 41 | CasM.289206 | 28 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 42 | CasM.289206 | 28 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 43 | CasM.290598 | 29 | ATTG (SEQ ID NO: 161), ACTG (SEQ ID NO: 165), GTTG (SEQ ID NO: 163), GCTG (SEQ ID NO: 166) |
| 46 | CasM.290816 | 30 | TCG (SEQ ID NO: 156) |
| 48 | CasM.295071 | 31 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 50 | CasM.295231 | 32 | TCG (SEQ ID NO: 156) or GCG (SEQ ID NO: 157) |
| 54 | CasM.279423 | 34 | ATTA (SEQ ID NO: 160), ATTG (SEQ ID NO: 161), GTTA (SEQ ID NO: 162), GTTG (SEQ ID NO: 163) |
| 71 | CasM.295105 | 43 | TTC (SEQ ID NO: 167) |
| 72 | CasM.295187 | 44 | TTC (SEQ ID NO: 167) |
| 74 | CasM.295929 | 45 | TTT (SEQ ID NO: 168), TTC (SEQ ID NO: 167) |
| 75 | CasM.295929 | 45 | TTT (SEQ ID NO: 168), TTC (SEQ ID NO: 167) |

FIG. 1 illustrates the composition of the sequences derived from libraries digested with RNP complexes comprising the denoted D2S effector proteins. As shown in FIG. 1, examination of the WebLogos derived from position frequency matrices (PFMs) revealed the presence of enriched 5′ PAM consensus sequences for the various D2S effector proteins.
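
As a minimal sketch of how a position frequency matrix of the kind underlying such WebLogos might be tallied, the following Python assumes that the PAM-region sequences adjacent to each cut site have already been extracted from the sequencing reads as equal-length strings. The function names, the toy reads, and the 50% consensus cutoff are illustrative assumptions and are not the analysis pipeline used in this Example; a real enrichment analysis would typically also compare against the uncut library.

```python
from collections import Counter

def position_frequency_matrix(pam_reads):
    """Return per-position base fractions for equal-length PAM-region reads."""
    if not pam_reads:
        raise ValueError("no reads supplied")
    length = len(pam_reads[0])
    if any(len(read) != length for read in pam_reads):
        raise ValueError("reads must all have the same length")
    counts = [Counter() for _ in range(length)]
    for read in pam_reads:
        for i, base in enumerate(read.upper()):
            counts[i][base] += 1
    total = len(pam_reads)
    # Counter returns 0 for bases never seen at a position.
    return [{b: counts[i][b] / total for b in "ACGT"} for i in range(length)]

def consensus(pfm, min_fraction=0.5):
    """Dominant base per position, or N when no base reaches min_fraction (assumed cutoff)."""
    out = []
    for column in pfm:
        base, fraction = max(column.items(), key=lambda kv: kv[1])
        out.append(base if fraction >= min_fraction else "N")
    return "".join(out)

# Hypothetical reads, for illustration only.
toy_reads = ["ATCG", "CTCG", "GTCG", "TTCG", "ATTG"]
pfm = position_frequency_matrix(toy_reads)
print(consensus(pfm))
```

In this toy input the first position is mixed, so it is reported as N, while the remaining positions are dominated by a single base.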

Example 2: DETECTR Activity of D2S Effector Proteins

D2S effector proteins were tested for trans cleavage. Briefly, partially purified (nickel-NTA purified) D2S effector proteins were incubated with crRNA and tracrRNA or sgRNAs in a trans cleavage buffer (20 mM Tricine, 15 mM MgCl2, 0.2 mg/ml BSA, 1 mM TCEP (pH 9 at 37° C.)) at room temperature for 20 minutes to produce effector protein-guide complexes, followed by addition of target nucleic acid at a final concentration of 10 nM. The components of the effector protein-guide complexes that were assayed are provided in TABLE 7. Trans cleavage activity was detected by fluorescence signal upon cleavage of a fluorophore-quencher reporter in a DETECTR reaction. Dilutions of the effector protein-guide complexes were performed, and the assay was repeated at 1%, 0.1% or 0.01% of the original protein concentration. The dilution that provided the highest signal ratio is listed.

TABLE 7 Observed Trans Cleavage for Effector Protein/Guide Combination

| Comp. No: | Effector Protein | Fold on/off ** | Dilution *** | Plasmid # | crRNA # | tracrRNA # | sgRNA # |
|---|---|---|---|---|---|---|---|
| 25 | CasM.293576 (SEQ ID NO: 19) | 1.69 | 0.1 | PL3316 | R4852 (SEQ ID NO: 64) | R4908 (SEQ ID NO: 115) | — |
| 26 | CasM.294537 (SEQ ID NO: 20) | 2.97 | 0.1 | PL3320 | R4853 (SEQ ID NO: 65) | R4941 (SEQ ID NO: 116) | — |
| 27 | CasM.294537 (SEQ ID NO: 20) | 2.05 | 0.01 | PL3320 | R4853 (SEQ ID NO: 65) | R4940 (SEQ ID NO: 117) | — |
| 31 | CasM.19924 (SEQ ID NO: 22) | 1.62 | 0.01 | PL3295 | — | — | R4886 (SEQ ID NO: 149) |
| 32 | CasM.19952 (SEQ ID NO: 23) | 2.08 | 0.1 | PL3296 | R4856 (SEQ ID NO: 68) | R4893 (SEQ ID NO: 120) | — |
| 34 | CasM.274559 (SEQ ID NO: 24) | 2.42 | 0.1 | PL3297 | R4857 (SEQ ID NO: 69) | R4894 (SEQ ID NO: 121) | — |
| 38 | CasM.288480 (SEQ ID NO: 26) | 2.74 | 0.01 | PL3307 | R4859 (SEQ ID NO: 71) | R4893 (SEQ ID NO: 120) | — |
| 39 | CasM.288480 (SEQ ID NO: 26) | 2.77 | 0.1 | PL3307 | — | — | R4886 (SEQ ID NO: 149) |
| 41 | CasM.289206 (SEQ ID NO: 28) | 1.8 | 0.01 | PL3310 | R4861 (SEQ ID NO: 73) | R4894 (SEQ ID NO: 121) | — |
| 42 | CasM.289206 (SEQ ID NO: 28) | 1.58 | 0.01 | PL3310 | — | — | R4887 (SEQ ID NO: 150) |
| 44 | CasM.290598 (SEQ ID NO: 29) | 1.64 | 0.01 | PL3311 | — | — | R4887 (SEQ ID NO: 150) |
| 45 | CasM.290816 (SEQ ID NO: 30) | 1.72 | 1 | PL3312 | R4863 (SEQ ID NO: 75) | R4912 (SEQ ID NO: 124) | — |
| 46 | CasM.290816 (SEQ ID NO: 30) | 1.61 | 1 | PL3312 | — | — | R4884 (SEQ ID NO: 152) |
| 51 | CasM.292139 (SEQ ID NO: 33) | 1.64 | 0.01 | PL3314 | R4989 (SEQ ID NO: 78) | R4890 (SEQ ID NO: 125) | — |
| 53 | CasM.292139 (SEQ ID NO: 33) | 1.89 | 1 | PL3314 | | | R4885 (SEQ ID NO: 153) |
| 59 | CasM.282952 (SEQ ID NO: 37) | 1.52 | 0.01 | PL3412 | R4867 (SEQ ID NO: 82) | R4918 (SEQ ID NO: 132) | |
| 62 | CasM.283262 (SEQ ID NO: 38) | 1.66 | 0.1 | PL3413 | R4868 (SEQ ID NO: 83) | R4919 (SEQ ID NO: 135) | |
| 66 | CasM.291507 (SEQ ID NO: 41) | 2.1 | 0.01 | PL3416 | R4871 (SEQ ID NO: 86) | R4944 (SEQ ID NO: 140) | |
| 74 | CasM.295929 (SEQ ID NO: 45) | 2.25 | 0.1 | PL3420 | R4874 (SEQ ID NO: 90) | R4928 (SEQ ID NO: 147) | |
| 75 | CasM.295929 (SEQ ID NO: 45) | 1.65 | 0.1 | PL3420 | R4874 (SEQ ID NO: 90) | R4927 (SEQ ID NO: 148) | |

** for those with trans-cleavage above 1.5 fold over no target

*** dilution for maximum trans cleavage activity
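
The fold on/off and dilution columns can be illustrated with a short Python sketch. It assumes that fold on/off is the ratio of reporter fluorescence with target to fluorescence without target, and that a combination is reported only when its best ratio exceeds 1.5, consistent with the table footnotes. The function names and the fluorescence readings are hypothetical and are not data from this Example.

```python
def fold_on_off(signal_with_target, signal_no_target):
    """Ratio of reporter signal with target to signal without target (assumed definition)."""
    if signal_no_target <= 0:
        raise ValueError("no-target signal must be positive")
    return signal_with_target / signal_no_target

def best_dilution(readings, threshold=1.5):
    """Pick the dilution with the highest fold on/off.

    readings maps a dilution label to (signal_with_target, signal_no_target).
    Returns (dilution, fold), or None when no dilution exceeds the threshold.
    """
    folds = {d: fold_on_off(on, off) for d, (on, off) in readings.items()}
    dilution, fold = max(folds.items(), key=lambda kv: kv[1])
    return (dilution, fold) if fold > threshold else None

# Hypothetical fluorescence readings at three dilutions, for illustration only.
toy_readings = {
    1: (1200.0, 1000.0),
    0.1: (2100.0, 1000.0),
    0.01: (1600.0, 1000.0),
}
print(best_dilution(toy_readings))
```

With these toy readings the middle dilution gives the highest ratio, so it is the one that would be listed.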

Example 3: CasM.19952 Edits Genomic DNA in Mammalian Cells

CasM.19952 was tested for its ability to produce indels in HEK293T cells. Briefly, a plasmid encoding CasM.19952 and a guide RNA was delivered by lipofection to HEK293T cells. This was performed for a variety of guide RNAs targeting up to twenty-four loci adjacent to biochemically determined PAM sequences. Indels were detected by next generation sequencing of PCR amplicons at the targeted loci and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and SpyCas9 were included as negative and positive controls, respectively. FIG. 2 shows the results. TABLE 8 describes the sequences of the single guide RNAs tested that provided the greatest percent of reads with indels. Non-bold, non-italicized, capital letters indicate the repeat sequence of the guide RNA; italicized letters indicate a linker; bold letters indicate the tracrRNA region; and the lowercase letters represent the spacer sequence. This experiment demonstrated that CasM.19952 is a robust editor of genomic DNA in mammalian cells.

A dose-response experiment confirmed the genome editing capability of CasM.19952 in mammalian cells. Plasmids encoding CasM.19952 and single guide RNAs were delivered at various concentrations by lipofection into HEK293T cells. CasM.19952 was programmed to target four loci. SpyCas9 was included as a positive control. Indels were observed at all four loci. Results are shown in FIG. 3.
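
The indel calculation and the quality-control filter described above can be sketched in Python as follows. The sketch assumes that the denominator of the indel percentage is the number of reads aligning to the reference, which the text does not state explicitly; the function names and read counts are hypothetical.

```python
def passes_qc(n_aligned_reads, n_total_reads, min_aligned_fraction=0.20):
    """Keep a library only if at least 20% of its reads align to the reference."""
    if n_total_reads <= 0:
        return False
    return n_aligned_reads / n_total_reads >= min_aligned_fraction

def indel_percentage(n_indel_reads, n_aligned_reads):
    """Percent of aligned reads carrying an insertion or deletion (assumed denominator)."""
    if n_aligned_reads <= 0:
        raise ValueError("no aligned reads")
    return 100.0 * n_indel_reads / n_aligned_reads

def summarize(library):
    """Return the indel percentage for a library, or None if it fails QC."""
    if not passes_qc(library["aligned"], library["total"]):
        return None
    return indel_percentage(library["indel"], library["aligned"])

# Hypothetical read counts, for illustration only.
good_library = {"total": 12000, "aligned": 10000, "indel": 1347}
poor_library = {"total": 12000, "aligned": 1500, "indel": 300}
print(summarize(good_library), summarize(poor_library))
```

The second toy library is dropped because only 12.5% of its reads align, regardless of how many of those reads carry indels.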

TABLE 8 sgRNAs that provided genome editing with CasM.19952 in HEK293T cells

| sgRNA | percent of reads with indels |
|---|---|
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACucuaggcgcccgcuaaguuc (SEQ ID NO: 180) | 13.47 |
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACcccggguaagccugucugcu (SEQ ID NO: 181) | 4.63 |
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACcgugcugnuuccuccccacg (SEQ ID NO: 182) | 19.40 |
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACgugccuuaguuucuucaucu (SEQ ID NO: 183) | 3.15 |
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACgggggcgggggggagaaaaa (SEQ ID NO: 184) | 18.35 |
| UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAUGGUACAUCCAACgcgcccuccgaucuggggug (SEQ ID NO: 185) | 9.48 |

Example 4: CasM.19952 Variants Edit Genomic DNA in Mammalian Cells with Greater Efficiency

Variants of CasM.19952 were generated and tested to identify variants with increased binding affinity and greater genomic editing efficiency relative to that of CasM.19952. Briefly, plasmid constructs encoding variants of CasM.19952 (SEQ ID NO: 23) were generated by mutating nucleotides that encode single amino acids of interest within the REC, RuvC-I, or RuvC-II domain from the wild-type residue to arginine, with the exception of residues that were already arginine. Generated variants had a single amino acid alteration to arginine (R) at amino acid position A110, T111, E112, M113, S114, T115, Q116, S117, L118, S119, F122, A123, T124, E125, L126, E127, T128, N129, I130, F131, A132, K261, V263, V264, G265, V266, D267, L268, G269, I270, N271, V272, P273, A274, Y275, V276, A277, T278, N279, I280, T281, E282, I457, A458, N459, S460, K461, D462, I463, I464, K466, N467, or E468 as set forth in SEQ ID NOS: 241-293 of TABLE 9 (positions identified with respect to SEQ ID NO: 23). Wild-type CasM.19952 (wt) (SEQ ID NO: 23) was included as a control.
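
The single-residue substitution scheme can be illustrated at the protein-sequence level with the following Python sketch. It applies an alteration written in the form used in TABLE 9 (for example A110R) to a 1-indexed amino acid sequence and skips residues that are already arginine. The function names and the five-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 23, and the Example itself mutated the encoding nucleotides in plasmid constructs.

```python
import re

def apply_substitution(sequence, alteration):
    """Apply an alteration such as 'A110R' to a 1-indexed protein sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", alteration)
    if match is None:
        raise ValueError(f"unrecognized alteration: {alteration}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError("position outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError("wild-type residue does not match the sequence")
    return sequence[: position - 1] + new + sequence[position:]

def arginine_scan(sequence, start, end):
    """Yield (label, variant) for each non-arginine residue in positions start..end."""
    for position in range(start, end + 1):
        wild_type = sequence[position - 1]
        if wild_type == "R":
            continue  # residues that are already arginine are skipped
        label = f"{wild_type}{position}R"
        yield label, apply_substitution(sequence, label)

# Hypothetical toy sequence, for illustration only.
toy_sequence = "MARTE"
variants = dict(arginine_scan(toy_sequence, 2, 5))
print(variants)
```

Checking the stated wild-type residue against the sequence before substituting guards against off-by-one numbering errors when positions are given relative to a reference sequence.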

Plasmid preparations of the various constructs were assessed for purity by absorbance and normalized to 100 ng/μL.

Each variant and control plasmid was incubated in reduced serum media (Opti-MEM) with an equal volume of a plasmid containing an sgRNA targeting either B2M2 or B2M4 (both normalized to 100 ng/μL, 1:1 mass of sgRNA plasmid:nuclease plasmid).

The mixtures, each containing a CasM.19952 variant plasmid construct and a plasmid construct encoding an sgRNA targeting B2M2 or B2M4, were delivered by lipofection to HEK293T cells. Indels were detected by next generation sequencing of PCR amplicons at the targeted loci. Target and primer sequences used to amplify the amplicons can be seen in TABLE 10. Indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Results are shown in TABLE 11 and TABLE 12. To demonstrate relative nuclease activity, the means of replicate values were plotted in relation to the two target loci, as grouped by domain, and normalized to the wild type. Results can be seen in FIGS. 4-6.
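
The normalization to wild type can be sketched in Python as follows, assuming that relative activity is the mean of a construct's replicate indel percentages divided by the mean of the wild-type replicates. The function name, the construct labels, and the values are hypothetical and are not taken from TABLE 11 or TABLE 12.

```python
from statistics import mean

def relative_activity(replicates_by_construct, wild_type_key="wt"):
    """Mean indel percentage of each construct divided by the wild-type mean."""
    wt_mean = mean(replicates_by_construct[wild_type_key])
    if wt_mean == 0:
        raise ValueError("wild-type mean is zero; ratio is undefined")
    return {
        construct: mean(values) / wt_mean
        for construct, values in replicates_by_construct.items()
    }

# Hypothetical replicate indel percentages, for illustration only.
toy_replicates = {
    "wt": [2.0, 4.0],
    "variant_a": [6.0, 12.0],
    "variant_b": [0.0, 1.5],
}
print(relative_activity(toy_replicates))
```

A value above 1 indicates more editing than the wild-type control under this definition, and a value below 1 indicates less.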

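TABLE 9 Exemplary Variants of CasM.19952 (SEQ ID NO: 23)

| Construct no. | Alteration | SEQ ID NO: |
|---|---|---|
| 1 | A110R | 241 |
| 2 | T111R | 242 |
| 3 | E112R | 243 |
| 4 | M113R | 244 |
| 5 | S114R | 245 |
| 6 | T115R | 246 |
| 7 | Q116R | 247 |
| 8 | S117R | 248 |
| 9 | L118R | 249 |
| 10 | S119R | 250 |
| 11 | F122R | 251 |
| 12 | A123R | 252 |
| 13 | T124R | 253 |
| 14 | E125R | 254 |
| 15 | L126R | 255 |
| 16 | E127R | 256 |
| 17 | T128R | 257 |
| 18 | N129R | 258 |
| 19 | I130R | 259 |
| 20 | F131R | 260 |
| 21 | A132R | 261 |
| 22 | K261R | 262 |
| 23 | V263R | 263 |
| 24 | V264R | 264 |
| 25 | G265R | 265 |
| 26 | V266R | 266 |
| 27 | D267R | 267 |
| 28 | L268R | 268 |
| 29 | G269R | 269 |
| 30 | I270R | 270 |
| 31 | N271R | 271 |
| 32 | V272R | 272 |
| 33 | P273R | 273 |
| 34 | A274R | 274 |
| 35 | Y275R | 275 |
| 36 | V276R | 276 |
| 37 | A277R | 277 |
| 38 | T278R | 278 |
| 39 | N279R | 279 |
| 40 | I280R | 280 |
| 41 | T281R | 281 |
| 42 | E282R | 282 |
| 43 | I457R | 283 |
| 44 | A458R | 284 |
| 45 | N459R | 285 |
| 46 | S460R | 286 |
| 47 | K461R | 287 |
| 48 | D462R | 288 |
| 49 | I463R | 289 |
| 50 | I464R | 290 |
| 51 | K466R | 291 |
| 52 | N467R | 292 |
| 53 | E468R | 293 |
| 54 | wt | 23 |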

TABLE 10 Target Amplicon Primers Target Spacer Target Forward ReverseB2M2 GATGGATGAAA TCGTCGGCAGCGTCAGATG GTCTCGTGGGCTCGGAGA CCCAGACACTGTATAAGAGACAGCCCA TGTGTATAAGAGACAGCA (SEQ ID NO: 294) AGTGAAATACCCTGGCGTGGGGGTGAATTCAGTG (SEQ ID NO: 295) (SEQ ID NO: 296) B2M4 GGCCGAGATGTTCGTCGGCAGCGTCAGATG GTCTCGTGGGCTCGGAGA CTCGCTCCG TGTATAAGAGACAGCCTCTTGTGTATAAGAGACAGGA (SEQ ID NO: 297) CTCTAACCTGGCACT (SEQGGGTAGGAGAGACTCACG ID NO: 298) (SEQ ID NO: 299)

TABLE 11 Variants of CasM.19952 (SEQ ID NO: 23) Targeting B2M2

Construct no.   Replicate 1 Indel Percentage   Replicate 2 Indel Percentage
1               0.0243709255                   0.0264183343
2               11.80903008                    11.55975252
3               0.1590562662                   0.1213469512
4               3.909401179                    4.195510803
5               9.633175559                    11.74726578
6               11.93083574                    13.08492201
7               5.841839872                    6.696656784
8               0.1928358558                   0.0889397116
9               1.801434152                    3.262092239
10              0.0268326715                   0.0098653381
11              0.070387837                    0.0260586319
12              0.0272464716                   0.0142257629
13              25.83235981                    26.42070165
14              1.615731463                    2.090964591
15              11.28852581                    16.0710087
16              17.00047814                    18.90607948
17              23.57286157                    27.76788893
18              19.93106844                    20.97760787
19              8.294062206                    9.293997272
20              7.2338181                      7.218394488
21              15.28013582                    17.52549286
22              12.13839579                    17.73327366
23              12.54012092                    12.29857971
24              0                              0.0223580265
25              0                              0.0160935572
26              0.0260111848                   0.0059616072
27              0.0316605984                   0.0118406252
28              0.054542149                    0.0343760743
29              0.0124633888                   0.0119524293
30              0.0198124422                   0
31              4.04440444                     3.583941914
32              1.672555948                    2.454394693
33              16.73819743                    23.40479193
34              0                              0.0056471651
35              0.0784481529                   0.0056322163
36              0.0607964333                   0.031375502
37              20.69262084                    25.1319078
38              29.06575985                    36.80249309
39              17.75051476                    21.01206434
40              9.301425531                    9.378700069
41              27.23742383                    30.56776133
42              31.50726855                    33.27960874
43              0.0061500615                   0.0119581465
44              14.77835163                    17.02872382
45              16.75675676                    20.11758074
46              15.34582987                    21.32122969
47              9.505341724                    10.49826475
48              20.29582318                    20.18798529
49              7.359531196                    8.803426593
50              8.905185961                    11.81126487
51              14.61948354                    19.14845559
52              11.45315152                    13.39380197
53              10.60639471                    14.31117352
54              0.0203984497                   0.033792346

TABLE 12 Variants of CasM.19952 (SEQ ID NO: 23) Targeting B2M4

Construct no.   Replicate 1 Indel Percentage   Replicate 2 Indel Percentage
1               0.043185352                    0.018146625
2               4.133738602                    4.038123903
3               0.169546262                    0.073549077
4               1.873151495                    1.722811875
5               2.298481933                    2.992013351
6               5.513433935                    4.681369233
7               2.365221987                    2.358761113
8               0.07403419                     0.015896988
9               0.777565328                    0.699813759
10              0.029262583                    0.007087675
11              0.081509082                    0.100493331
12              0.017353579                    0
13              5.93902898                     8.131763208
14              0.861000587                    0.706082518
15              4.682963379                    6.506568145
16              5.439283716                    5.788635157
17              8.984796469                    12.06173461
18              7.740565583                    7.89090152
19              2.071005917                    2.201331767
20              4.907545351                    3.173109819
21              3.894992153                    4.444144266
22              5.706861707                    8.00478919
23              5.482057219                    5.428681276
24              0.132751754                    0.042423814
25              0.006322311                    0.011249859
26              0.139679811                    0.037074798
27              0                              0.040025616
28              0.173451689                    0.349344978
29              0                              0.023699491
30              0.016924769                    0.020249064
31              0.739534568                    0.882793411
32              0.333111259                    0.610736098
33              0.659563673                    1.160872875
34              0.029513035                    0.019199386
35              0.108069164                    0.009848336
36              0.012193635                    0.009329229
37              0.422500207                    0.818021646
38              8.529945554                    10.53685168
39              2.823706249                    3.787957842
40              2.182810368                    2.912861022
41              6.361163423                    9.705258539
42              7.744796998                    11.33583268
43              0.032425422                    0
44              0.075677312                    0.194590387
45              4.940509915                    6.497097042
46              4.612868048                    5.634609094
47              2.681992337                    4.139978128
48              4.959950709                    6.668446699
49              4.043285785                    4.850129028
50              3.731494626                    4.326276882
51              5.679806919                    7.238833071
52              5.537331059                    5.336870027
53              6.186200959                    6.520273524
54              0.0067999456                   0.0299401198

Example 5: PAM Screening for D2S Effector Proteins

D2S effector protein and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from E. coli. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C. and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 13) generally had a lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 13). In some cases, the 1% enrichment met the cutoff criteria but the 5% enrichment did not; in such cases, a PAM is included for the 1% enrichment but not for the 5% enrichment. Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 13. TABLE 13 also shows the effector protein SEQ ID NO (under Enzyme Seq ID NO), the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. FIGS. 7A-7E illustrate PAM preferences for the different D2S effector proteins used in this example. As shown in TABLE 13, the IVE assay revealed enriched 5′ PAM consensus sequences for the various D2S effector proteins.
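The PAM consensus sequences reported for this example use IUPAC degenerate nucleotide codes (for example, N is any base, Y is C or T, K is G or T, R is A or G, and W is A or T). As an illustrative sketch only, a candidate 7-nucleotide sequence can be checked against such a consensus in Python as shown below; the function name and the test sequences are hypothetical.

```python
# Standard IUPAC nucleotide codes mapped to the bases they permit.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_pam(sequence, pattern):
    """Return True if every base of `sequence` is allowed by `pattern`."""
    if len(sequence) != len(pattern):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(sequence.upper(), pattern.upper())
    )


# NNNNTTC and NNNKNTK are consensus sequences listed in TABLE 13;
# the 7-mers being tested are made up for illustration.
print(matches_pam("ACGTTTC", "NNNNTTC"))
print(matches_pam("ACGTTTA", "NNNNTTC"))
print(matches_pam("AAAGATT", "NNNKNTK"))
```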

TABLE 13 Compositions for D2S effector protein PAM screening cr/sgRNA #tracrRNA # Comp. Enzyme cr/sgRNA Seq ID cr/ tracrRNA Seq ID Seq ID NO.PAM_1 % PAM_5 % NO. sgRNA NO. PL3314, R4882 NNNKNTK (SEQ ID NO: 310)NNNNNTN (SEQ ID NO: 319) R4882 sgRNA 33 (SEQ ID NO: 151) PL3314, R4887NNNKNTT (SEQ ID NO: 311) NNNNNTN (SEQ ID NO: 319) R4887 SgRNA 33(SEQ ID NO: 150) PL3318, R4845, NNNNTTC (SEQ ID NO: 331)NNNNTNN (SEQ ID NO: 329) R4845 crRNA R5946 R5946 (SEQ ID NO: 53)(Seq ID NO: 372)  8 PL3318, R5938 NNNNYTN (SEQ ID NO: 338)NNNNTYN (SEQ ID NO: 335) R5938 sgRNA  8 (SEQ ID NO: 373) PL3411, R4873,NNNNTTC (SEQ ID NO: 331) NNNNTTN (SEQ ID NO: 332) R4873 crRNA R4945R4945 (SEQ ID NO: 89) (SEQ ID NO: 145) 36 PL3411, R4874,NNNNTTC (SEQ ID NO: 331) NNNNTTY (SEQ ID NO: 333) R4874 crRNA R4928R4928 (SEQ ID NO: 90) (SEQ ID NO: 147) 36 PL3411,R5867NNNTTCN (SEQ ID NO: 351) NNNTTYN (SEQ ID NO: 354) R5867 sgRNA 36(SEQ ID NO: 374) PL3411,R5868 NNNNTTC (SEQ ID NO: 331)NNNNTTY (SEQ ID NO: 333) R5868 sgRNA 36 (SEQ ID NO: 375) PL3411,R5925NNNNTTC (SEQ ID NO: 331) NNNNTTY (SEQ ID NO: 333) R5925 sgRNA 36(SEQ ID NO: 376) PL3412, R4874, NNNNTTY (SEQ ID NO: 333)NNNNYTY (SEQ ID NO: 339) R4874 crRNA R4928 R4928 (SEQ ID NO: 90)(SEQ ID NO: 147) 37 PL3412, R5925 NNNNNTY (SEQ ID NO: 320)NNNNNTY (SEQ ID NO: 320) R5925 sgRNA 37 (SEQ ID NO: 376) PL3412, R5933NNNNTTY (SEQ ID NO: 333) NNNNYTY (SEQ ID NO: 339) R5933 sgRNA 37(SEQ ID NO: 377) PL3413, R4873, NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R4873 crRNA R4945 R4945 (SEQ ID NO: 89)(SEQ ID NO: 145) 38 PL3413, R4874, NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R4874 crRNA R4928 R4928 (SEQ ID NO: 90)(SEQ ID NO: 147) 38 PL3413, R5867 NNNTTCN (SEQ ID NO: 351)NNNTTCN (SEQ ID NO: 351) R5867 sgRNA 38 (SEQ ID NO: 374) PL3413, R5868NNNNTTC (SEQ ID NO: 331) NNNNTTY (SEQ ID NO: 333) R5868 sgRNA 38(SEQ ID NO: 375) PL3413, R5925 NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R5925 sgRNA 38 (SEQ ID NO: 376) PL3413, R5931NNNNTTC (SEQ ID NO: 331) NNNNTTC (SEQ ID 
NO: 331) R5931 sgRNA 38(SEQ ID NO: 378) PL3413, R5932 NNNNTTC (SEQ ID NO: 331)NNNNTNY (SEQ ID NO: 330) R5932 sgRNA 38 (SEQ ID NO: 379) PL3414, R4873,NNNTYCT (SEQ ID NO: 355) NNNNNCT (SEQ ID NO: 317) R4873 crRNA R4945R4945 (SEQ ID NO: 89) (SEQ ID NO: 145) 39 PL3414, R5867NNNNTYN (SEQ ID NO: 335) NNNNNYN (SEQ ID NO: 321) R5867 sgRNA 39(SEQ ID NO: 374) PL3414, R5868 NNNNNNT (SEQ ID NO: 302)NNNNNYT (SEQ ID NO: 323) R5868 sgRNA 39 (SEQ ID NO: 375) PL3414, R5925NNNNNYT (SEQ ID NO: 323) NNNNNYT (SEQ ID NO: 323) R5925 sgRNA 39(SEQ ID NO: 376) PL3414, R5929 NNNCTTN (SEQ ID NO: 306) R5929 sgRNA 39(SEQ ID NO: 380) PL3414, R5930 NNNTYYT (SEQ ID NO: 359)NNNNNYT (SEQ ID NO: 323) R5930 sgRNA 39 (SEQ ID NO: 381) PL3415, R5867NNNNNYN (SEQ ID NO: 321) NNNNNYN (SEQ ID NO: 321) R5867 sgRNA 40(SEQ ID NO: 374) PL3416, R4873, NNNNNYT (SEQ ID NO: 323)NNNNNNT (SEQ ID NO: 302) R4873 crRNA R4945 R4945 (SEQ ID NO: 89)(SEQ ID NO: 145) 41 PL3416, R4874, NNNNNYT (SEQ ID NO: 323)NNNNNNT (SEQ ID NO: 302) R4874 crRNA R4928 R4928 (SEQ ID NO: 90)(SEQ ID NO: 147) 41 PL3416, R5867 NNNWNCT (SEQ ID NO:NNNNNCT (SEQ ID NO: 317) R5867 sgRNA 41 358) (SEQ ID NO: 374)PL3417, R4873, NNNNTTC (SEQ ID NO: 331) NNNNTTY (SEQ ID NO: 333) R4873crRNA R4945 R4945 (SEQ ID NO: 89) (SEQ ID NO: 145) 42 PL3417, R4874,NNNNTTC (SEQ ID NO: 331) NNNNTTY (SEQ ID NO: 333) R4874 crRNA R4928R4928 (SEQ ID NO: 90) (SEQ ID NO: 147) 42 PL3417, R5867NNNTTTN (SEQ ID NO: 353) NNNTTYN (SEQ ID NO: 354) R5867 sgRNA 42(SEQ ID NO: 374) PL3417, R5868 NNNTYYW (SEQ ID NO:NNNNTYN (SEQ ID NO: 335) R5868 sgRNA 42 357) (SEQ ID NO: 375)PL3417, R5925 NNNTYYN (SEQ ID NO: 356) NNNNTTY (SEQ ID NO: 333) R5925sgRNA 42 (SEQ ID NO: 376) PL3418, R4873, NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R4873 crRNA R4945 R4945 (SEQ ID NO: 89)(SEQ ID NO: 145) 43 PL3418, R4874, NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R4874 crRNA R4928 R4928 (SEQ ID NO: 90)(SEQ ID NO: 147) 43 PL3418, R5867 NNNTTCN (SEQ ID NO: 351)NNNTTCN (SEQ ID NO: 351) R5867 sgRNA 43 (SEQ 
ID NO: 374) PL3418, R5868NNNNTTC (SEQ ID NO: 331) NNNNTTC (SEQ ID NO: 331) R5868 sgRNA 43(SEQ ID NO: 375) PL3418, R5925 NNNNTTC (SEQ ID NO: 331)NNNNTTC (SEQ ID NO: 331) R5925 sgRNA 43 (SEQ ID NO: 376) PL4976, R5800NNNNCCR (SEQ ID NO: 313) NNNNCCN (SEQ ID NO: 312) R5800 sgRNA 203 (SEQ ID NO: 382) PL4977, R5726, NNNNCCN (SEQ ID NO: 312)NNNNCCN (SEQ ID NO: 312) R5726 crRNA R5783 R5783 (SEQ ID NO: 383)(SEQ ID NO: 384) 209  PL4977, R5799 NNNNCCN (SEQ ID NO: 312)NNNNCCN (SEQ ID NO: 312) R5799 sgRNA 209  (SEQ ID NO: 385) PL4977, R5800NNNNCCN (SEQ ID NO: 312) NNNNCCN (SEQ ID NO: 312) R5800 sgRNA 209 (SEQ ID NO: 382) PL4977, R5801 NNNNCCR (SEQ ID NO: 313)NNNNCCN (SEQ ID NO: 312) R5801 sgRNA 209  (SEQ ID NO: 386) PL4977, R5802NNNNCCN (SEQ ID NO: 312) NNNNCCN (SEQ ID NO: 312) R5802 sgRNA 209 (SEQ ID NO: 387) PL3302, R5913 R5913 sgRNA  4 (SEQ ID NO: 388)PL3302, R5914 R5914 sgRNA  4 (SEQ ID NO: 389) PL3306, R5935NNNNTNN (SEQ ID NO: 329) R5935 sgRNA  5 (SEQ ID NO: 390) PL3306, R5936NNNNTTY (SEQ ID NO: 333) NNNNTYC (SEQ ID NO: 334) R5936 sgRNA  5(SEQ ID NO: 391) PL3310, R5959 R5959 sgRNA 28 (SEQ ID NO: 392)PL3310, R5960 R5960 sgRNA 28 (SEQ ID NO: 393) PL3310, R5961 R5961 sgRNA28 (SEQ ID NO: 394) PL3310, R5962 R5962 sgRNA 28 (SEQ ID NO: 395)PL3310, R5963 R5963 sgRNA 28 (SEQ ID NO: 396) PL3310, R5964 R5964 sgRNA28 (SEQ ID NO: 397) PL3310, R5965 R5965 sgRNA 28 (SEQ ID NO: 398)PL3310, R5977 R5977 sgRNA 28 (SEQ ID NO: 399) PL3310, R5978 R5978 sgRNA28 (SEQ ID NO: 400) PL3310, R5979 R5979 sgRNA 28 (SEQ ID NO: 401)PL3310, R5980 R5980 sgRNA 28 (SEQ ID NO: 402) PL3319, R4846,NNNGNNN (SEQ ID NO: 307) R4846 crRNA R5947 R5947 (SEQ ID NO: 54)(SEQ ID NO: 403)  9 PL3327, R4879, NNNNCTT (SEQ ID NO: 314)NNNNNTT (SEQ ID NO: 404) R4879 crRNA R4935 R4935 (SEQ ID NO: 405)(SEQ ID NO: 91)  1 PL3327, R5911 R5911 sgRNA  1 (SEQ ID NO: 406)PL3327, R5912 R5912 sgRNA  1 (SEQ ID NO: 407) PL3410, R4873,NNNNTTC (SEQ ID NO: 331) NNNNYTC (SEQ ID NO: 337) R4873 crRNA R4945R4945 (SEQ ID NO: 89) (SEQ ID NO: 145) 35 
PL3410, R4874,NNNNTTC (SEQ ID NO: 331) NNNNYWC (SEQ ID NO: R4874 crRNA R4928 R4928340) (SEQ ID NO: 90) (SEQ ID NO: 147) 35 PL3410, R5867NNNNTTC (SEQ ID NO: 331) NNNTNYN (SEQ ID NO: 350) R5867 sgRNA 35(SEQ ID NO: 374) PL3419, R4873, R4873 crRNA R4945 R4945 (SEQ ID NO: 89)(SEQ ID NO: 145) 44 PL3419, R5923 R5923 sgRNA 44 (SEQ ID NO: 408)PL3419, R5924 R5924 sgRNA 44 (SEQ ID NO: 409) PL3420, R5925 R5925 sgRNA45 (SEQ ID NO: 376) PL3420, R5926 R5926 sgRNA 45 (SEQ ID NO: 410)PL3420, R5927 R5927 sgRNA 45 (SEQ ID NO: 411) PL3420, R5928 R5928 sgRNA45 (SEQ ID NO: 412) PL3414, R4873, NNNTYCT (SEQ ID NO: 355)NNNNNCT (SEQ ID NO: 317) R4873 crRNA R4945 R4945 (SEQ ID NO: 89)(SEQ ID NO: 145) 39 PL3414, R5867 NNNNTYN (SEQ ID NO: 335)NNNNNYN (SEQ ID NO: 321) R5867 sgRNA 39 (SEQ ID NO: 374) PL3414, R5868NNNNNNT (SEQ ID NO: 302) NNNNNYT (SEQ ID NO: 323) R5868 sgRNA 39(SEQ ID NO: 375) PL3414, R5925 NNNNNYT (SEQ ID NO: 323)NNNNNYT (SEQ ID NO: 323) R5925 sgRNA 39 (SEQ ID NO: 376) PL3414, R5929NNNCTTN (SEQ ID NO: 306) NNNCTTN (SEQ ID NO: 306) R5929 sgRNA 39(SEQ ID NO: 380) PL3414, R5930 NNNTYYT (SEQ ID NO: 359)NNNNNYT (SEQ ID NO: 323) R5930 sgRNA 39 (SEQ ID NO: 381) PL3327, R4879,NNNNCTT (SEQ ID NO: 314) NNNNNTT (SEQ ID NO: 404) R4879 crRNA R4935R4935 (SEQ ID NO: 405) (SEQ ID NO: 91)  1

Example 6: PAM Screening for D2S Effector Proteins

D2S effector protein and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from E. coli. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C. and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 14) generally had a lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 14). In some cases, the 1% enrichment met the cutoff criteria but the 5% enrichment did not; in such cases, a PAM is included for the 1% enrichment but not for the 5% enrichment. Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 14. TABLE 14 also shows the effector protein SEQ ID NO (under Enzyme Seq ID NO), the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. FIGS. 7A-7E illustrate PAM preferences for the different D2S effector proteins used in this example. As shown in TABLE 14, the IVE assay revealed enriched 5′ PAM consensus sequences for the various D2S effector proteins.
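One way a degenerate consensus could be derived from a set of enriched PAM sequences is sketched below in Python. This is not the disclosed analysis: the cutoff criteria used for the 1% and 5% enrichments are not stated here, so the per-position frequency threshold (`min_fraction`), the function name, and the input sequences are all hypothetical.

```python
from collections import Counter

# Standard IUPAC codes, inverted so a set of bases maps to its code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def consensus_pam(sequences, min_fraction=0.1):
    """Collapse equal-length A/C/G/T sequences into one IUPAC consensus.

    At each position, bases seen in at least `min_fraction` of the sequences
    are kept and the resulting set is written as a single IUPAC code.
    `min_fraction` is an illustrative threshold, not a value from the source.
    """
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must all have the same length")
    codes = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sequences)
        kept = frozenset(
            base for base, n in counts.items()
            if n / len(sequences) >= min_fraction
        )
        codes.append(CODE_FOR[kept])
    return "".join(codes)


# Made-up enriched sequences: variable at positions 1-4, fixed TTC at 5-7.
enriched = ["ACGTTTC", "CGTATTC", "GTACTTC", "TACGTTC"]
print(consensus_pam(enriched))  # prints NNNNTTC
```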

TABLE 14 Compositions or D2S effector protein PAM screening cr/sgRNA #tracrRNA# Comp. Enzyme cr/sgRNA Seq ID tracrRNA Seq ID Seq ID NO.PAM_1 % PAM_5 % NO. cr/sgRNA NO. PL4967, R5727,  NNWTTYN (SEQ ID NO:NNNNTYN (SEQ ID NO: 335) R5727 crRNA R5786 R5786 366) (SEQ ID NO: 413)(SEQ ID NO: 414) 204 PL4968, R5728,  NNNTTTN (SEQ ID NO:NNNNTTN (SEQ ID NO: 332) R5728 crRNA R5788 R5788 353) (SEQ ID NO: 415)(SEQ ID NO: 416) 212 PL4970, R5730,  NNWWTTN (SEQ ID NO: R5730 crRNAR5791 R5791 367) (SEQ ID NO: 417) (SEQ ID NO: 418) 232 PL4970, R5730, NNTTTYN (SEQ ID NO: NNNNTYN (SEQ ID NO: 335) R5730 crRNA R5792 R5792365) (SEQ ID NO: 417) (SEQ ID NO: 419) 232 PL4980, R5691, NRNNNNN (SEQ ID NO: R5691 crRNA R5814 R5814 303) (SEQ ID NO: 420)(SEQ ID NO: 421) 218 PL4988, R5697,  NNNNNNG (SEQ ID NO: R5697 crRNAR5831 R5831 301) (SEQ ID NO: 422) (SEQ ID NO: 423) 206 PL4988, R5697, NNNTNTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5697 crRNA R5847 R5847349) (SEQ ID NO: 422) (SEQ ID NO: 424) 206 PL4988, R5869NNNTNTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5869 sgRNA 206 349)(SEQ ID NO: 425) PL4988, R5873 NNNTNTG (SEQ ID NO:NNNNNTG (SEQ ID NO: 318) R5873 sgRNA 206 349) (SEQ ID NO: 426)PL4989, R5698,  NNNWNTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5698 crRNAR5832 R5832 360) (SEQ ID NO: 427) (SEQ ID NO: 428) 221 PL4989, R5698, NNNWNTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5698 crRNA R5848 R5848360) (SEQ ID NO: 427) (SEQ ID NO: 429) 221 PL4990, R5699, NNNTNTG (SEQ ID NO: NNNNNYR (SEQ ID NO: 322) R5699 crRNA R5833 R5833349) (SEQ ID NO: 430) (SEQ ID NO: 431) 228 PL4990, R5699, NNNWYTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5699 crRNA R5833 R5849361) (SEQ ID NO: 430) (SEQ ID NO: 431) 228 PL4990, R5699NNNWNTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5699 crRNA R5849 228 360)(SEQ ID NO: 430) (SEQ ID NO: 432) PL4990, R5870 NNNWYTG (SEQ ID NO:NNNNNTG (SEQ ID NO: 318) R5870 sgRNA 228 361) (SEQ ID NO: 433)PL4990, R5874 NNNWYTG (SEQ ID NO: NNNNNTG (SEQ ID NO: 318) R5874 sgRNA228 361) (SEQ ID NO: 434) PL4991, R5700,  
NNNTNTG (SEQ ID NO:NNNNNTG (SEQ ID NO: 318) R5700 crRNA R5834 R5834 349) (SEQ ID NO: 435)(SEQ ID NO: 436) 233 PL4991, R5700,  NNNWNTG (SEQ ID NO:NNNNNTG (SEQ ID NO: 318) R5700 crRNA R5850 R5850 360) (SEQ ID NO: 435)(SEQ ID NO: 437) 233 PL4992, R5702,  NNNRTRG (SEQ ID NO:NNNNNNG (SEQ ID NO: 301) R5702 crRNA R5846 R5846 343) (SEQ ID NO: 438)(SEQ ID NO: 439) 240 PL4992, R5702,  NNNRTRG (SEQ ID NO:NNNNNNG (SEQ ID NO: 301) R5702 crRNA R5861 R5861 343) (SEQ ID NO: 438)(SEQ ID NO: 440) 240 PL4994, R5835 NNKRTTN (SEQ ID NO:NNNNTTN (SEQ ID NO: 332) R5835 sgRNA 202 305) (SEQ ID NO: 441)PL4994, R5851 NNKRTTN (SEQ ID NO: R5851 sgRNA 202 305) (SEQ ID NO: 442)PL4995, R5836 NNNRTTN (SEQ ID NO: NNNRTTN (SEQ ID NO: 345) R5836 sgRNA205 345) (SEQ ID NO: 443) PL4995, R5852 NNNRTTN (SEQ ID NO:NNNRTTN (SEQ ID NO: 345) R5852 sgRNA 205 345) (SEQ ID NO: 444)PL4997, R5838 NNNRTWG (SEQ ID NO: NNNRTTG (SEQ ID NO: 344) R5838 sgRNA208 346) (SEQ ID NO: 445) PL4997, R5854 NNNRTWG (SEQ ID NO:NNNRTTG (SEQ ID NO: 344) R5854 sgRNA 208 346) (SEQ ID NO: 446)PL4998, R5871 NNRGTYG (SEQ ID NO: NNNGTYN (SEQ ID NO: 309) R5871 sgRNA213 363) (SEQ ID NO: 447) PL4998, R5876 NNNGTYG (SEQ ID NO:NNNGTYN (SEQ ID NO: 309) R5876 sgRNA 213 308) (SEQ ID NO: 448)PL4999, R5840 NNNRTNG (SEQ ID NO: NNNRNNG (SEQ ID NO: 341) R5840 sgRNA216 342) (SEQ ID NO: 449) PL4999, R5855 NNNRTNG (SEQ ID NO:NNNRNNG (SEQ ID NO: 341) R5855 sgRNA 216 342) (SEQ ID NO: 450)PL5000, R5841 NNNRTTN (SEQ ID NO: NNNRTTN (SEQ ID NO: 345) R5841 sgRNA217 345) (SEQ ID NO: 451) PL5000, R5856 NNNRTTN (SEQ ID NO:NNNRTTN (SEQ ID NO: 345) R5856 sgRNA 217 345) (SEQ ID NO: 452)PL5001, R5842 NNNTNCG (SEQ ID NO: NNNNNCG (SEQ ID NO: 316) R5842 sgRNA220 348) (SEQ ID NO: 453) PL5001, R5842 NNNTKCG (SEQ ID NO:NNNNNCG (SEQ ID NO: 316) R5842 sgRNA 220 347) (SEQ ID NO: 453)PL5001, R5857 NNNTKCG (SEQ ID NO: NNNNNCG (SEQ ID NO: 316) R5857 sgRNA220 347) (SEQ ID NO: 454) PL5002, R5843 NNNRTRG (SEQ ID NO: R5843 sgRNA225 343) (SEQ ID NO: 455) PL5002, R5858 NNNRTRG (SEQ ID 
NO:NNNNNNG (SEQ ID NO: 301) R5858 sgRNA 225 343) (SEQ ID NO: 456)PL5003, R5844 NNNNTCG (SEQ ID NO: NNNNNCG (SEQ ID NO: 316) R5844 sgRNA229 325) (SEQ ID NO: 457) PL5003, R5859 NNNNTCG (SEQ ID NO:NNNNNCG (SEQ ID NO: 316) R5859 sgRNA 229 325) (SEQ ID NO: 458)PL5004, R5683,  NNNYTTR (SEQ ID NO: NNNNTYR (SEQ ID NO: 336) R5683 crRNAR5807 R5807 362) (SEQ ID NO: 459) (SEQ ID NO: 460) 210 PL5004, R5867NNNTTYN (SEQ ID NO: NNNNNYN (SEQ ID NO: 321) R5867 sgRNA 210 354)(SEQ ID NO: 374) PL5005, R5684,  NNNNTTC (SEQ ID NO: R5684 crRNA R5808R5808 331) (SEQ ID NO: 461) (SEQ ID NO: 462) 234 PL5005, R5868NNNTTNY (SEQ ID NO: NNNTTNY (SEQ ID NO: 352) R5868 sgRNA 234 352)(SEQ ID NO: 375) PL3302, R5913 NNNNTTC (SEQ ID NO:NNNNTTC (SEQ ID NO: 331) R5913 sgRNA 4 331) (SEQ ID NO: 388)PL3302, R5913 NNNNTTC (SEQ ID NO: NNNNTTC (SEQ ID NO: 331) R5913 sgRNA 4331) (SEQ ID NO: 388) PL3420, R5926 NNNNNTY (SEQ ID NO:NNNNTTY (SEQ ID NO: 333) R5926 sgRNA 45 320) (SEQ ID NO: 411)

Example 7: DETECTR Activity of D2S Effector Proteins

D2S effector proteins were tested for trans cleavage. Briefly, partially purified (nickel-NTA purified) D2S effector proteins were incubated with crRNA and tracrRNA, or with sgRNAs, in a trans cleavage buffer (20 mM Tricine, 15 mM MgCl2, 0.2 mg/ml BSA, 1 mM TCEP, pH 9 at 37° C.) at room temperature for 20 minutes, followed by addition of target nucleic acid at a final concentration of 10 nM to produce effector-protein guide complexes. The components of the effector-protein guide complexes that were assayed are provided in TABLE 15, which shows the composition of each experiment, the effector Enzyme SEQ ID NO, the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. Trans cleavage activity was detected by fluorescence signal upon cleavage of a fluorophore-quencher reporter (200 nM) in a DETECTR reaction. Fluorescence activity is shown under FC max rate in TABLE 15, which indicates the maximum rate of fluorescence generated over the course of the DETECTR reaction. Dilutions of the effector-protein guide complexes were performed, and the assay was repeated at 1%, 0.1% or 0.01% of the original protein concentration. The dilution that provided the highest signal ratio is listed in TABLE 15.
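A maximum rate of fluorescence generation can be estimated from a fluorescence time course as the largest slope between consecutive reads. The Python sketch below illustrates that calculation under stated assumptions: how the FC max rate values in TABLE 15 were actually computed is not specified here, and the function name, time points, and fluorescence values are hypothetical.

```python
def max_rate(times, signal):
    """Largest slope between consecutive reads of a fluorescence trace.

    `times` and `signal` are equal-length sequences; times must be strictly
    increasing. Units follow the inputs (e.g., fluorescence units per minute).
    """
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    rates = [
        (signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    ]
    return max(rates)


# Hypothetical trace: reads taken once per minute.
minutes = [0, 1, 2, 3, 4]
fluorescence = [100, 110, 150, 210, 230]
print(max_rate(minutes, fluorescence))  # prints 60.0
```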

TABLE 15 Compositions for D2S effector protein PAM screening

Composition             FC max rate   P value   Enzyme Seq ID NO   cr/sgRNA sequence         cr/sgRNA   tracrRNA sequence
PL5006, R5804           2             0.025     223                R5804 (SEQ ID NO: 463)    sgRNA
PL5007, R5705, R5875    2.5           0.016     224                R5705 (SEQ ID NO: 464)    crRNA      R5875 (SEQ ID NO: 465)
PL5022, R5772           2.24          0.040     214                R5772 (SEQ ID NO: 466)    sgRNA

Example 8: D2S Enzyme Edit Genomic DNA in Mammalian Cells

D2S effectors were tested for their ability to produce indels in HEK293T cells. Briefly, 150 ng of nuclease-carrying plasmid and 150 ng of gRNA-carrying plasmid were delivered by lipofection to HEK293T cells in 96 well plates. TransIT-293 reagent was diluted with warmed OPTIMEM and mixed with the plasmid DNA at a ratio of 2:1 lipid:DNA. Lipid:DNA mixtures were incubated for 10 minutes at room temperature before 20 μL of the lipid:DNA OPTIMEM mixture was added to each well. Cells were incubated for 3 days before being lysed and subjected to PCR amplification. TABLE 16 shows the constructs (e.g., composition) tested and their indel percent in HEK293T cells. TABLE 16 also shows the PAM 1% enrichment sequence, the effector protein SEQ ID NO (under Enzyme SEQ ID NO), and the sgRNA sequence, if applicable.

Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” was included as a negative control. TABLE 16 shows the results of this experiment. The results in TABLE 16 show that the D2S enzymes had nuclease activity.
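The quality control filter and indel calculation described above can be sketched in Python as follows. This is an illustrative sketch only: the class and function names and the read counts are hypothetical, and the use of aligned reads as the denominator for indel percentage is an assumption, since the denominator is not specified above.

```python
from dataclasses import dataclass


@dataclass
class Library:
    name: str
    total_reads: int    # all reads sequenced for the library
    aligned_reads: int  # reads aligning to the reference amplicon
    indel_reads: int    # aligned reads containing an insertion or deletion


def passes_qc(lib, min_aligned_fraction=0.20):
    """Keep libraries with at least 20% of reads aligning to the reference."""
    if lib.total_reads <= 0:
        return False
    return lib.aligned_reads / lib.total_reads >= min_aligned_fraction


def indel_percent(lib):
    """Indel reads as a percentage of aligned reads (assumed denominator)."""
    return 100.0 * lib.indel_reads / lib.aligned_reads


# Hypothetical libraries: the second falls below the 20% alignment cutoff.
libraries = [
    Library("sample_a", total_reads=1000, aligned_reads=800, indel_reads=40),
    Library("sample_b", total_reads=1000, aligned_reads=150, indel_reads=30),
]
report = {lib.name: indel_percent(lib) for lib in libraries if passes_qc(lib)}
print(report)  # prints {'sample_a': 5.0}
```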

TABLE 16 Indels by D2S effectors

Composition       PAM 1%                   Indel percent   sgRNA SEQ ID NO:   Enzyme SEQ ID NO:
PL5614, PL6521    NTCG (SEQ ID NO: 369)    0.11            SEQ ID NO: 180     202
PL5614, PL6522    RTTR (SEQ ID NO: 370)    0.14            SEQ ID NO: 467     202
PL5616, PL6522    RTTR (SEQ ID NO: 370)    0.74            SEQ ID NO: 467     208
PL5618, PL6522    RTTR (SEQ ID NO: 370)    1.70            SEQ ID NO: 467     25
PL5619, PL6522    RTTR (SEQ ID NO: 370)    5.09            SEQ ID NO: 467     28
PL5620, PL6522    RTTR (SEQ ID NO: 370)    0.46            SEQ ID NO: 467     217
PL5621, PL6522    RTTR (SEQ ID NO: 370)    3.89            SEQ ID NO: 467     219
PL5622, PL6521    NTCG (SEQ ID NO: 369)    1.58            SEQ ID NO: 180     236
PL5622, PL6522    RTTR (SEQ ID NO: 370)    1.36            SEQ ID NO: 467     236
PL5623, PL6522    RTTR (SEQ ID NO: 370)    1.04            SEQ ID NO: 467     237
PL5624, PL6522    RTTR (SEQ ID NO: 370)    0.13            SEQ ID NO: 467     29
PL5625, PL6521    NTCG (SEQ ID NO: 369)    0.33            SEQ ID NO: 180     30
PL5627, PL6521    NTCG (SEQ ID NO: 369)    0.86            SEQ ID NO: 180     32

Example 9: D2S Enzyme Edit Genomic DNA in Mammalian Cells

Enzymes were tested for their ability to produce indels in HEK293T cells. Briefly, plasmids encoding the enzymes and guide RNAs were delivered by lipofection to HEK293T cells. Cells were incubated for approximately 48 hours before being lysed. Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and SpyCas9 were included as negative and positive controls, respectively. TABLE 17 describes the sequences of the single guide RNAs tested and the percent of reads with indels. TABLE 17 also shows the compositions tested, the PAM 1% enrichment sequence, the effector protein SEQ ID NO (under Enzyme SEQ ID NO), and the sgRNA sequence, if applicable. The results in TABLE 17 show that the D2S enzymes had nuclease activity.

TABLE 17 Indels by D2S effectors

Comp.     PAM 1%                   Indel percent   sgRNA SEQ ID NO:   Enzyme SEQ ID NO:
PL5995    TNTG (SEQ ID NO: 368)    21.36           SEQ ID NO: 468     228
PL7302    NTCG (SEQ ID NO: 369)    7.90            SEQ ID NO: 469     238
PL7319    NTCG (SEQ ID NO: 369)    6.94            SEQ ID NO: 470     238
PL7303    NTCG (SEQ ID NO: 369)    1.44            SEQ ID NO: 471     238
PL7309    NTCG (SEQ ID NO: 369)    1.37            SEQ ID NO: 472     238
PL6239    NTTC (SEQ ID NO: 371)    1.43            SEQ ID NO: 473     45
PL6246    NTTC (SEQ ID NO: 371)    0.90            SEQ ID NO: 473     45
PL6243    NTTC (SEQ ID NO: 371)    0.29            SEQ ID NO: 474     45
PL6237    NTTC (SEQ ID NO: 371)    0.21            SEQ ID NO: 475     45
PL7375    RTTR (SEQ ID NO: 370)    0.70            SEQ ID NO: 476     30
PL6412    NTTC (SEQ ID NO: 371)    0.95            SEQ ID NO: 477     38
PL6414    NTTC (SEQ ID NO: 371)    0.70            SEQ ID NO: 478     38
PL6417    NTTC (SEQ ID NO: 371)    0.13            SEQ ID NO: 479     38
PL7399    RTTR (SEQ ID NO: 370)    0.70            SEQ ID NO: 476     229
PL7420    RTTR (SEQ ID NO: 370)    0.60            SEQ ID NO: 480     229
PL7328    RTTR (SEQ ID NO: 370)    0.69            SEQ ID NO: 481     222

Example 10: CasM19952 Edits Genomic DNA in Mammalian Cells with Multiple sgRNA

D2S effectors were tested for their ability to produce indels in HEK293T cells. Briefly, 150 ng of nuclease-carrying plasmid and 150 ng of gRNA-carrying plasmid were delivered by lipofection to HEK293T cells in 96 well plates. TransIT-293 reagent was diluted with warmed OPTIMEM and mixed with the plasmid DNA at a ratio of 2:1 lipid:DNA. Lipid:DNA mixtures were incubated for 10 minutes at room temperature before 20 μL of the lipid:DNA OPTIMEM mixture was added to each well. Cells were incubated for 3 days before being lysed and subjected to PCR amplification. TABLE 18 shows the constructs (e.g., composition) tested and their indel percent in HEK293T cells. TABLE 18 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO) and the sgRNA sequence, if applicable. The PAM 1% enrichment sequence for this experiment was NTCG (SEQ ID NO: 369).

Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and SpyCas9 were included as negative and positive controls, respectively. TABLE 18 shows the results of this experiment. The results in TABLE 18 show that the D2S enzymes had nuclease activity.
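TABLE 18 lists several compositions more than once with different indel percentages. Assuming, for illustration only, that rows sharing a composition are replicate measurements, they can be summarized per composition as in the Python sketch below. The function name is hypothetical; the values are those listed in TABLE 18 for the compositions containing PL5853 and PL5741.

```python
from collections import defaultdict
from statistics import mean, stdev

# (sgRNA plasmid of the composition, indel percent) rows from TABLE 18.
rows = [
    ("PL5853", 26.455), ("PL5853", 27.069), ("PL5853", 28.127),
    ("PL5741", 24.052), ("PL5741", 24.284), ("PL5741", 24.376),
]


def summarize(rows):
    """Group indel percentages by composition; return mean, SD, and count.

    Assumption: repeated rows for one composition are replicates.
    """
    grouped = defaultdict(list)
    for composition, percent in rows:
        grouped[composition].append(percent)
    return {
        composition: (
            mean(values),
            stdev(values) if len(values) > 1 else 0.0,
            len(values),
        )
        for composition, values in grouped.items()
    }


summary = summarize(rows)
for composition, (avg, sd, n) in summary.items():
    print(f"{composition}: mean {avg:.3f}, SD {sd:.3f}, n={n}")
```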

TABLE 18 Indels by CasM19952 Comp. Enzyme SEQ ID NO: Indel percent sgRNASEQ ID NO: PL5879, PL3651 0.104 SEQ ID NO: 482 23 PL5876, PL3651 0.111SEQ ID NO: 483 23 PL5680, PL3651 0.111 SEQ ID NO: 484 23 PL5680, PL36510.111 SEQ ID NO: 484 23 PL5691, PL3651 0.120 SEQ ID NO: 485 23 PL5680,PL3651 0.122 SEQ ID NO: 484 23 PL5674, PL3651 0.125 SEQ ID NO: 486 23PL5873, PL3651 0.133 SEQ ID NO: 487 23 PL5670, PL3651 0.138 SEQ ID NO:488 23 PL5874, PL3651 0.140 SEQ ID NO: 489 23 PL5690, PL3651 0.142 SEQID NO: 490 23 PL5688, PL3651 0.142 SEQ ID NO: 491 23 PL5679, PL36510.149 SEQ ID NO: 492 23 PL5668, PL3651 0.153 SEQ ID NO: 493 23 PL5682,PL3651 0.161 SEQ ID NO: 494 23 PL5685, PL3651 0.162 SEQ ID NO: 495 23PL5682, PL3651 0.177 SEQ ID NO: 494 23 PL5878, PL3651 0.182 SEQ ID NO:496 23 PL5875, PL3651 0.186 SEQ ID NO: 497 23 PL5873, PL3651 0.190 SEQID NO: 487 23 PL5690, PL3651 0.200 SEQ ID NO: 490 23 PL5690, PL36510.225 SEQ ID NO: 490 23 PL5875, PL3651 0.231 SEQ ID NO: 497 23 PL5686,PL3651 0.241 SEQ ID NO: 498 23 PL5678, PL3651 0.245 SEQ ID NO: 499 23PL5685, PL3651 0.270 SEQ ID NO: 495 23 PL5679, PL3651 0.276 SEQ ID NO:492 23 PL5877, PL3651 0.298 SEQ ID NO: 500 23 PL5689, PL3651 0.315 SEQID NO: 501 23 PL5875, PL3651 0.326 SEQ ID NO: 497 23 PL5685, PL36510.343 SEQ ID NO: 495 23 PL5877, PL3651 0.355 SEQ ID NO: 500 23 PL5877,PL3651 0.367 SEQ ID NO: 500 23 PL5880, PL3651 0.409 SEQ ID NO: 502 23PL5689, PL3651 0.421 SEQ ID NO: 501 23 PL5880, PL3651 0.440 SEQ ID NO:502 23 PL5682, PL3651 0.448 SEQ ID NO: 494 23 PL5881, PL3651 0.450 SEQID NO: 503 23 PL5689, PL3651 0.453 SEQ ID NO: 501 23 PL5669, PL36510.467 SEQ ID NO: 504 23 PL5694, PL3651 0.520 SEQ ID NO: 505 23 PL5881,PL3651 0.601 SEQ ID NO: 503 23 PL5669, PL3651 0.617 SEQ ID NO: 504 23PL5694, PL3651 0.639 SEQ ID NO: 505 23 PL5881, PL3651 0.656 SEQ ID NO:503 23 PL5683, PL3651 0.658 SEQ ID NO: 506 23 PL5683, PL3651 0.665 SEQID NO: 506 23 PL5673, PL3651 0.669 SEQ ID NO: 507 23 PL5693, PL36510.681 SEQ ID NO: 508 23 PL5673, PL3651 0.681 SEQ ID NO: 
507 23 PL5694,PL3651 0.684 SEQ ID NO: 505 23 PL5684, PL3651 0.704 SEQ ID NO: 509 23PL5683, PL3651 0.710 SEQ ID NO: 506 23 PL5669, PL3651 0.713 SEQ ID NO:504 23 PL5681, PL3651 0.723 SEQ ID NO: 510 23 PL5673, PL3651 0.736 SEQID NO: 507 23 PL5681, PL3651 0.738 SEQ ID NO: 510 23 PL5671, PL36510.748 SEQ ID NO: 511 23 PL5684, PL3651 0.761 SEQ ID NO: 509 23 PL5671,PL3651 0.800 SEQ ID NO: 511 23 PL5681, PL3651 0.850 SEQ ID NO: 510 23PL5693, PL3651 0.924 SEQ ID NO: 508 23 PL5671, PL3651 0.945 SEQ ID NO:511 23 PL5684, PL3651 1.041 SEQ ID NO: 509 23 PL5693, PL3651 1.053 SEQID NO: 508 23 PL5880, PL3651 1.513 SEQ ID NO: 502 23 PL5677, PL36512.340 SEQ ID NO: 512 23 PL5677, PL3651 2.377 SEQ ID NO: 512 23 PL5677,PL3651 2.613 SEQ ID NO: 512 23 PL5672, PL3651 2.630 SEQ ID NO: 513 23PL5672, PL3651 2.861 SEQ ID NO: 513 23 PL5672, PL3651 3.629 SEQ ID NO:513 23 PL5687, PL3651 4.047 SEQ ID NO: 514 23 PL5687, PL3651 4.083 SEQID NO: 514 23 PL5687, PL3651 4.211 SEQ ID NO: 514 23 PL5785, PL36514.762 SEQ ID NO: 515 23 PL5857, PL3651 8.796 SEQ ID NO: 516 23 PL5857,PL3651 8.869 SEQ ID NO: 516 23 PL5857, PL3651 9.317 SEQ ID NO: 516 23PL5869, PL3651 10.779 SEQ ID NO: 517 23 PL5869, PL3651 11.648 SEQ ID NO:517 23 PL5869, PL3651 11.715 SEQ ID NO: 517 23 PL5809, PL3651 12.082 SEQID NO: 518 23 PL5809, PL3651 12.323 SEQ ID NO: 518 23 PL5746, PL365112.385 SEQ ID NO: 519 23 PL5785 , PL3651 12.772 SEQ ID NO: 515 23PL5746, PL3651 12.795 SEQ ID NO: 519 23 PL5821, PL3651 13.028 SEQ ID NO:520 23 PL5675, PL3651 13.042 SEQ ID NO: 521 23 PL5695, PL3651 13.171 SEQID NO: 522 23 PL5809, PL3651 13.360 SEQ ID NO: 518 23 PL5695, PL365113.374 SEQ ID NO: 522 23 PL5785, PL3651 13.415 SEQ ID NO: 515 23 PL5675,PL3651 13.541 SEQ ID NO: 521 23 PL5695, PL3651 13.558 SEQ ID NO: 522 23PL5696, PL3651 13.690 SEQ ID NO: 523 23 PL5675, PL3651 13.691 SEQ ID NO:521 23 PL5821, PL3651 13.959 SEQ ID NO: 520 23 PL5821, PL3651 14.008 SEQID NO: 520 23 PL5696, PL3651 14.387 SEQ ID NO: 523 23 PL5696, PL365114.427 SEQ ID NO: 523 23 PL5746, 
PL3651 14.455 SEQ ID NO: 519 23 PL5813,PL3651 14.671 SEQ ID NO: 524 23 PL5788, PL3651 14.932 SEQ ID NO: 525 23PL5788, PL3651 14.947 SEQ ID NO: 525 23 PL5788, PL3651 15.031 SEQ ID NO:525 23 PL5743, PL3651 15.306 SEQ ID NO: 526 23 PL5817, PL3651 15.431 SEQID NO: 527 23 PL5787, PL3651 15.780 SEQ ID NO: 528 23 PL5825, PL365115.781 SEQ ID NO: 529 23 PL5745, PL3651 16.012 SEQ ID NO: 530 23 PL5825,PL3651 16.080 SEQ ID NO: 529 23 PL5787, PL3651 16.133 SEQ ID NO: 528 23PL5745, PL3651 16.234 SEQ ID NO: 530 23 PL5813, PL3651 16.242 SEQ ID NO:524 23 PL5787, PL3651 16.243 SEQ ID NO: 528 23 PL5813, PL3651 16.299 SEQID NO: 524 23 PL5745, PL3651 16.379 SEQ ID NO: 530 23 PL5817, PL365116.437 SEQ ID NO: 527 23 PL5825, PL3651 17.232 SEQ ID NO: 529 23 PL5837,PL3651 17.270 SEQ ID NO: 531 23 PL5748, PL3651 17.325 SEQ ID NO: 532 23PL5697, PL3651 17.376 SEQ ID NO: 533 23 PL5748, PL3651 17.397 SEQ ID NO:532 23 PL5841, PL3651 17.403 SEQ ID NO: 534 23 PL5737, PL3651 17.410 SEQID NO: 180 23 PL5740, PL3651 17.422 SEQ ID NO: 535 23 PL5739, PL365117.476 SEQ ID NO: 536 23 PL5739, PL3651 17.507 SEQ ID NO: 536 23 PL5739,PL3651 17.567 SEQ ID NO: 536 23 PL5744, PL3651 17.667 SEQ ID NO: 537 23PL5817, PL3651 17.743 SEQ ID NO: 527 23 PL5740, PL3651 17.800 SEQ ID NO:535 23 PL5742, PL3651 17.891 SEQ ID NO: 538 23 PL5737, PL3651 17.985 SEQID NO: 180 23 PL5697, PL3651 18.004 SEQ ID NO: 533 23 PL5740, PL365118.009 SEQ ID NO: 535 23 PL5845, PL3651 18.138 SEQ ID NO: 539 23 PL5744,PL3651 18.142 SEQ ID NO: 537 23 PL5743, PL3651 18.158 SEQ ID NO: 526 23PL5789, PL3651 18.162 SEQ ID NO: 540 23 PL5829, PL3651 18.319 SEQ ID NO:541 23 PL5743, PL3651 18.573 SEQ ID NO: 526 23 PL5829, PL3651 18.654 SEQID NO: 541 23 PL5738, PL3651 18.716 SEQ ID NO: 542 23 PL5845, PL365118.796 SEQ ID NO: 539 23 PL5837, PL3651 18.832 SEQ ID NO: 531 23 PL5829,PL3651 18.903 SEQ ID NO: 541 23 PL5697, PL3651 18.935 SEQ ID NO: 533 23PL5744, PL3651 19.177 SEQ ID NO: 537 23 PL5790, PL3651 19.269 SEQ ID NO:543 23 PL5837 , PL3651 19.359 SEQ ID NO: 
531 23 PL5738, PL3651 19.376SEQ ID NO: 542 23 PL5738, PL3651 19.393 SEQ ID NO: 542 23 PL5737, PL365119.431 SEQ ID NO: 180 23 PL5789, PL3651 19.438 SEQ ID NO: 540 23 PL5841,PL3651 19.445 SEQ ID NO: 534 23 PL5845, PL3651 19.518 SEQ ID NO: 539 23PL5742, PL3651 19.719 SEQ ID NO: 538 23 PL5841, PL3651 19.736 SEQ ID NO:534 23 PL5833, PL3651 19.841 SEQ ID NO: 544 23 PL5790, PL3651 19.903 SEQID NO: 543 23 PL5747, PL3651 20.101 SEQ ID NO: 545 23 PL5747, PL365120.142 SEQ ID NO: 545 23 PL5849, PL3651 20.220 SEQ ID NO: 546 23 PL5742,PL3651 20.326 SEQ ID NO: 538 23 PL5790, PL3651 20.725 SEQ ID NO: 543 23PL5789, PL3651 21.113 SEQ ID NO: 540 23 PL5833, PL3651 21.632 SEQ ID NO:544 23 PL5748, PL3651 21.703 SEQ ID NO: 532 23 PL5833, PL3651 21.746 SEQID NO: 544 23 PL5786, PL3651 21.806 SEQ ID NO: 547 23 PL5849, PL365121.858 SEQ ID NO: 546 23 PL5747, PL3651 21.953 SEQ ID NO: 545 23 PL5849,PL3651 22.178 SEQ ID NO: 546 23 PL5786, PL3651 22.673 SEQ ID NO: 547 23PL5786, PL3651 22.987 SEQ ID NO: 547 23 PL5741, PL3651 24.052 SEQ ID NO:548 23 PL5741, PL3651 24.284 SEQ ID NO: 548 23 PL5741, PL3651 24.376 SEQID NO: 548 23 PL5853, PL3651 26.455 SEQ ID NO: 549 23 PL5853, PL365127.069 SEQ ID NO: 549 23 PL5853, PL3651 28.127 SEQ ID NO: 549 23

Example 11: D2S Enzymes Edit Genomic DNA in Mammalian Cells

D2S effectors were tested for their ability to produce indels in HEK293T cells. Briefly, 300 ng of plasmids expressing the effector and transcribing the targeting gRNA were delivered by lipofection to HEK293T cells in 96-well plates. TransIT-293 reagent was diluted with pre-warmed Opti-MEM and mixed with the plasmid DNA at a 2:1 lipid:DNA ratio. The lipid:DNA mixture was incubated for 10 minutes at room temperature before 20 μL of the lipid:DNA Opti-MEM mixture was added to each well. Cells were incubated for 3 days before being lysed and subjected to PCR amplification. TABLE 19 shows the constructs (e.g., compositions) tested and their indel percentages in HEK293T cells. TABLE 19 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO) and the crRNA or sgRNA sequence, if applicable.

Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and SpyCas9 were included as negative and positive controls, respectively. TABLE 19 shows the results of this experiment, which indicate that the D2S enzymes had nuclease activity.
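By way of non-limiting illustration, the indel percentage calculation and the 20% alignment quality-control filter described above may be sketched as follows. The class name, field names, and the choice of aligned reads as the denominator are assumptions made for this sketch; the disclosure does not specify the analysis software used.

```python
from dataclasses import dataclass
from typing import Optional

# QC threshold taken from the text: libraries with less than 20% of reads
# aligning to the reference are excluded.
MIN_ALIGNED_FRACTION = 0.20


@dataclass
class LibraryCounts:
    """Hypothetical per-library read counts (names are illustrative)."""
    total_reads: int    # all reads in the sequencing library
    aligned_reads: int  # reads aligning to the reference amplicon
    indel_reads: int    # aligned reads containing an insertion or deletion


def indel_percent(lib: LibraryCounts) -> Optional[float]:
    """Return the indel percentage, or None if the library fails QC.

    Assumption: the denominator is the number of aligned reads.
    """
    if lib.total_reads == 0:
        return None
    if lib.aligned_reads / lib.total_reads < MIN_ALIGNED_FRACTION:
        return None  # excluded for quality control
    return 100.0 * lib.indel_reads / lib.aligned_reads
```

For example, a library with 1,000 total reads, 800 aligned reads, and 8 indel-containing reads would pass the filter and report 1.0 percent, whereas a library with only 150 of 1,000 reads aligned would be excluded.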

TABLE 19 Indels by D2S effectors Comp. Enzyme SEQ Indel cr/sgRNA SEQ IDID NO: percent NO: cr/sgRNA tracrRNA if applicable PL4891 0.498SEQ ID NO: 550 crRNA in CGAUUCCUCCCUACAGUAG 211  plasmidUUAGGUAUAGCCGAAAGGU PL4895 0.149 SEQ ID NO: 551 crRNA inAGAGACUAAAUCUGUAGUU 211  plasmid GGAGUGGGCCGCUUGCAUC PL4904 1.028SEQ ID NO: 552 crRNA in GGCCUAAAGUUGAGAAGUG 211  plasmidUCAGACUCUGAUAACCCUC PL4907 0.500 SEQ ID NO: 553 crRNA inAACGACGAUAUUCUUUAUU 211  plasmid UCGGUUCAAAGUUCUGCAC PL4908 0.198SEQ ID NO: 554 crRNA in AAAACAGGUGAGUCCUUAU 211  plasmidAAACCGGUGUGCAGAACG PL4909 0.965 SEQ ID NO: 555 crRNA in (SEQ ID NO: 938)211  plasmid PL4915 0.639 SEQ ID NO: 550 crRNA in CGAUUCCUCCCUACAGUAG230 plasmid AGAGACUAAAUCUGUAGUU PL4919 0.159 SEQ ID NO: 551 crRNA inGGAGUGGGCCGCUUGCAUC 230 plasmid UUAGGUAUAGCCGAAAGGU PL4932 0.185SEQ ID NO: 554 crRNA in GGCCUAAAGUUGAGAAGUG 230 plasmidUCAGACUCUGAUAACCCUC AACGACGAUAUUCUUUAUU UCGGUUCAAAGUUCUGCACAAAACAGGUGAGUCCUUAU AAACCGGUGUGCAGAACG (SEQ ID NO: 939) PL4942 0.260SEQ ID NO: 556 crRNA in CGAUUCCUCCCUACAGUAG 226 plasmidUUAGGUAUAGCCGAAAGGU PL4952 0.167 SEQ ID NO: 557 crRNA inAGAGACUAAAUCUGUAGUU 226 plasmid GGAGUGGGCCGCUUGCAUC GGCCUAAAGUUGAGAAGUGUCAGACUCUGAUAACCCUC AACGACGAUAUUCUUUAUU UCGGUUCAAAGUUCUGCACAAAACAGGUGAGUCCUUAU AAACCGGUGUGCAGAACG (SEQ ID NO: 940) PL4293 0.327SEQ ID NO: 558 sgRNA in N/A 22 plasmid PL4295 0.426 SEQ ID NO: 559sgRNA in N/A 22 plasmid PL4296 0.133 SEQ ID NO: 181 sgRNA in N/A 22plasmid PL4298 0.117 SEQ ID NO: 182 sgRNA in N/A 22 plasmid PL4304 3.592SEQ ID NO: 184 sgRNA in N/A 22 plasmid PL4305 0.467 SEQ ID NO: 560sgRNA in N/A 22 plasmid PL4308 0.105 SEQ ID NO: 561 sgRNA in N/A 22plasmid PL4309 0.916 SEQ ID NO: 185 sgRNA in N/A 22 plasmid PL4341 0.172SEQ ID NO: 562 sgRNA in N/A 24 plasmid PL4342 0.197 SEQ ID NO: 563sgRNA in N/A 24 plasmid PL4343 1.157 SEQ ID NO: 564 sgRNA in N/A 24plasmid PL4345 1.441 SEQ ID NO: 565 sgRNA in N/A 24 plasmid PL4346 0.101SEQ ID NO: 566 sgRNA in N/A 24 plasmid PL4352 0.102 SEQ ID NO: 567sgRNA in 
N/A 24 plasmid PL4353 0.260 SEQ ID NO: 568 sgRNA in N/A 24plasmid PL4356 0.166 SEQ ID NO: 569 sgRNA in N/A 24 plasmid PL4358 0.182SEQ ID NO: 570 sgRNA in N/A 25 plasmid PL4360 0.662 SEQ ID NO: 481sgRNA in N/A 25 plasmid PL4375 9.193 SEQ ID NO: 571 sgRNA in N/A 25plasmid PL4378 0.550 SEQ ID NO: 572 sgRNA in N/A 25 plasmid PL4381 0.970SEQ ID NO: 573 sgRNA in N/A 25 plasmid PL4389 0.160 SEQ ID NO: 558sgRNA in N/A 26 plasmid PL4391 0.373 SEQ ID NO: 559 sgRNA in N/A 26plasmid PL4404 0.193 SEQ ID NO: 561 sgRNA in N/A 26 plasmid PL4406 0.238SEQ ID NO: 574 sgRNA in N/A 28 plasmid PL4408 0.783 SEQ ID NO: 575sgRNA in N/A 28 plasmid PL4417 0.131 SEQ ID NO: 576 sgRNA in N/A 28plasmid PL4426 0.639 SEQ ID NO: 577 sgRNA in N/A 28 plasmid PL4427 0.247SEQ ID NO: 578 sgRNA in N/A 28 plasmid PL4434 0.889 SEQ ID NO: 579sgRNA in N/A 29 plasmid PL4453 0.106 SEQ ID NO: 580 sgRNA in N/A 29plasmid PL4454 0.271 SEQ ID NO: 570 sgRNA in N/A 31 plasmid PL4456 0.822SEQ ID NO: 481 sgRNA in N/A 31 plasmid PL4474 0.560 SEQ ID NO: 572sgRNA in N/A 31 plasmid PL4477 0.756 SEQ ID NO: 573 sgRNA in N/A 31plasmid PL4486 0.156 SEQ ID NO: 581 sgRNA in N/A 32 plasmid PL4487 0.299SEQ ID NO: 582 sgRNA in N/A 32 plasmid PL4488 0.260 SEQ ID NO: 583sgRNA in N/A 32 plasmid PL4497 0.316 SEQ ID NO: 584 sgRNA in N/A 32plasmid PL4500 0.409 SEQ ID NO: 585 sgRNA in N/A 32 plasmid PL4501 0.364SEQ ID NO: 586 sgRNA in N/A 32 plasmid PL4510 0.116 SEQ ID NO: 581sgRNA in N/A 30 plasmid PL4513 0.825 SEQ ID NO: 587 sgRNA in N/A 30plasmid PL4520 0.338 SEQ ID NO: 588 sgRNA in N/A 30 plasmid PL4524 0.241SEQ ID NO: 585 sgRNA in N/A 30 plasmid PL4670 0.191 SEQ ID NO: 574sgRNA in N/A 34 plasmid PL4699 0.239 SEQ ID NO: 589 crRNA inGAAGGCCGACCUGUACGGC 15 plasmid CUUAAGGUUGAGAAGGCAC PL4700 0.219SEQ ID NO: 590 crRNA in AUGUAAGUGGAAAAAUGCU 15 plasmidCCAAGCACACACGUUUUUU PL4701 0.230 SEQ ID NO: 591 crRNA inUUCCCGUUGUGUUCGCUCA 15 plasmid U (SEQ ID NO: 107) PL4751 0.122SEQ ID NO: 592 crRNA in AUAUUAAGGGCGGCUCAGC 44 plasmidGUCCUUAAGUCGAGAAAGU 
AUACAUAAAUUUCUUAUAU AGAAUAGUAGAUACUCUCGGCAAGGUAUAAACCCUACA AAUUUAAUCCUUGUAGGCA ACUUAUAUUUGUAUUUAUUU (SEQ ID NO: 145) PL4771 0.623 SEQ ID NO: 593 crRNA inAAACAAGGGCGGCUCAACG 45 plasmid UCCUAGAAUCGAGAAAGUA PL4788 0.217SEQ ID NO: 594 crRNA in UGCGUAAGACUUAUUUAUU 45 plasmidGAGCGGUAGAUACUCUCGG UAAGGUAUAAAUUC (SEQ ID NO: 148) PL4862 0.186SEQ ID NO: 595 crRNA in AUGAAUAGGAUUUAUCCUA 34 plasmidUGGGGCAGUUGGUUGCCCU PL4864 0.637 SEQ ID NO: 596 crRNA inUAGCCUGAGGCAUUUAAUG 34 plasmid CACUCGGGAAGUACCUUUU PL4882 0.423SEQ ID NO: 597 crRNA in CUCA (SEQ ID NO: 121) 34 plasmid

Example 12: PAM Screening for D2S Effector Proteins

D2S effector proteins and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from HEK293T cells. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C., and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 20) generally had lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 20). In some cases, the 1% enrichment met the cutoff criteria, but the 5% enrichment did not. In such cases, a PAM is included for the 1% enrichment, but not the 5% enrichment. Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 20. TABLE 20 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO), the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. FIGS. 7A-7E illustrate PAM preferences for the different D2S effector proteins used in this example. As shown in TABLE 20, examination of the IVE assay revealed the presence of enriched 5′ PAM consensus sequences for the various D2S effector proteins.

TABLE 20 Compositions for D2S effector protein PAM screening
Comp. | Enzyme SEQ ID NO: | PAM 1% | PAM 5% | cr/sgRNA # (cr/sgRNA SEQ ID NO:) | cr/sgRNA | tracrRNA # (tracrRNA SEQ ID NO:)
PL5632, R5724, R5780 | 227 | NNNNTYN (SEQ ID NO: 335) | NNNNTYN (SEQ ID NO: 335) | R5724 (SEQ ID NO: 598) | crRNA | R5780 (SEQ ID NO: 599)
PL5636, R5693, R5827 | 231 | NNNNCCR (SEQ ID NO: 313) | NNNNCCN (SEQ ID NO: 312) | R5693 (SEQ ID NO: 600) | crRNA | R5827 (SEQ ID NO: 601)
PL5637, R5865 | 239 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5865 (SEQ ID NO: 602) | sgRNA | N/A
PL5637, R5866 | 239 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5866 (SEQ ID NO: 603) | sgRNA | N/A
PL5638, R4876, R4942 | 16 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4876 (SEQ ID NO: 60) | crRNA | R4942 (SEQ ID NO: 107)
PL5638, R4849, R5952 | 16 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4849 (SEQ ID NO: 61) | crRNA | R5952 (SEQ ID NO: 604)
PL5638, R5917 | 16 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5917 (SEQ ID NO: 605) | sgRNA | N/A
PL5638, R5919 | 16 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5919 (SEQ ID NO: 606) | sgRNA | N/A
PL5642, R4852, R4908 | 19 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4852 (SEQ ID NO: 64) | crRNA | R4908 (SEQ ID NO: 607)
PL5642, R4852, R5955 | 19 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4852 (SEQ ID NO: 64) | crRNA | R5955 (SEQ ID NO: 608)
PL5642, R5917 | 19 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5917 (SEQ ID NO: 605) | sgRNA | N/A
PL5643, R4853, R5956 | 20 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4853 (SEQ ID NO: 62) | crRNA | R5956 (SEQ ID NO: 609)
PL5649, R5853 | 207 | NNANRTT (SEQ ID NO: 304) | NNNNRTT (SEQ ID NO: 324) | R5853 (SEQ ID NO: 610) | sgRNA | N/A
PL5640, R5917 | 14 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5917 (SEQ ID NO: 605) | sgRNA | N/A
PL5640, R5919 | 14 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R5919 (SEQ ID NO: 606) | sgRNA | N/A
PL5640, R4876, R4942 | 14 | NNNNNCC (SEQ ID NO: 315) | NNNNNCC (SEQ ID NO: 315) | R4876 (SEQ ID NO: 60) | crRNA | R4942 (SEQ ID NO: 611)
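The PAM consensus sequences in TABLE 20 use IUPAC degenerate nucleotide codes (e.g., N for any base, Y for C or T, R for A or G). As a non-limiting sketch, a candidate 5′ flanking sequence may be checked against such a consensus by expanding each code into a regular-expression character class. The function names below are hypothetical and are not part of the disclosed methods.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex fragments.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def pam_to_regex(pam: str) -> "re.Pattern[str]":
    """Compile a degenerate PAM such as 'NNNNTYN' into a regex."""
    return re.compile("".join(IUPAC_REGEX[code] for code in pam.upper()))


def matches_pam(pam: str, flank: str) -> bool:
    """Return True if the DNA flank satisfies every position of the PAM."""
    return pam_to_regex(pam).fullmatch(flank.upper()) is not None
```

Under this sketch, the flank ACGTTCA satisfies NNNNTYN (T at position 5, C at position 6), while ACGTCCT does not satisfy NNNNCCR because the final T is not a purine.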

Example 13: PAM Screening for D2S Effector Proteins

D2S effector proteins and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from E. coli. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C., and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 21) generally had lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 21). In some cases, the 1% enrichment met the cutoff criteria, but the 5% enrichment did not. In such cases, a PAM is included for the 1% enrichment, but not the 5% enrichment. Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 21. TABLE 21 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO) and the cr/sgRNA designation number with its corresponding sequence, if applicable. FIGS. 7A-7E illustrate PAM preferences for the different D2S effector proteins used in this example. As shown in TABLE 21, examination of the IVE assay revealed the presence of enriched 5′ PAM consensus sequences for the various D2S effector proteins.

TABLE 21 Compositions for D2S effector protein PAM screening
Comp. | Enzyme SEQ ID NO: | PAM 1% | PAM 5% | cr/sgRNA # (cr/sgRNA SEQ ID NO:) | cr/sgRNA
PL4970, R7618 | 232 | NNTTTYN (SEQ ID NO: 365) | NNNNTYN (SEQ ID NO: 335) | R7618 (SEQ ID NO: 612) | sgRNA
PL4991, R7605 | 233 | NNNWNTG (SEQ ID NO: 360) | NNNNNTG (SEQ ID NO: 318) | R7605 (SEQ ID NO: 613) | sgRNA
PL4992, R7608 | 240 | NNNRTRG (SEQ ID NO: 343) | NNNNNNG (SEQ ID NO: 301) | R7608 (SEQ ID NO: 614) | sgRNA
PL5632, R7620 | 227 | NNNNTYN (SEQ ID NO: 335) | NNNNTYN (SEQ ID NO: 335) | R7620 (SEQ ID NO: 615) | sgRNA
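In TABLE 21, the PAM 5% consensus is generally less specific than the PAM 1% consensus (e.g., NNTTTYN versus NNNNTYN). One way this can arise, offered only as an illustrative assumption since the disclosure does not describe the consensus-calling procedure, is that a larger top fraction of enriched sequences admits more distinct bases at each position. The sketch below ranks hypothetical enrichment scores, keeps a top fraction, and collapses the retained sequences into an IUPAC consensus; all names and scores are invented for illustration.

```python
import math

# IUPAC nucleotide codes mapped to the set of bases each represents.
IUPAC_SETS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
SET_TO_CODE = {frozenset(bases): code for code, bases in IUPAC_SETS.items()}


def top_fraction(scores, fraction):
    """Return the sequences in the top `fraction` by score (at least one)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


def consensus(seqs):
    """Collapse equal-length DNA sequences into one IUPAC consensus string."""
    if not seqs:
        raise ValueError("no sequences given")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")
    # zip(*seqs) yields one column of bases per position.
    return "".join(SET_TO_CODE[frozenset(col)] for col in zip(*seqs))


# Hypothetical enrichment scores for four 3-nt sequences.
scores = {"TTG": 9.0, "ATG": 8.0, "CCC": 1.0, "GGA": 0.5}
tight = consensus(top_fraction(scores, 0.5))  # top half only
loose = consensus(top_fraction(scores, 1.0))  # everything
```

With these invented scores, the top half collapses to WTG, while including all sequences degrades the consensus to NBV, mirroring the loss of specificity at the looser cutoff.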

Example 14: D2S Enzyme Edits Genomic DNA in Mammalian Cells

An enzyme was tested for its ability to produce indels in HEK293T cells. Briefly, a plasmid encoding the enzyme and guide RNA was delivered by lipofection to HEK293T cells. Cells were incubated for approximately 48 hours before being lysed. Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and SpyCas9 were included as negative and positive controls, respectively. TABLE 22 describes the sequence of the single guide RNA tested and the percent of reads with indels. Additionally, TABLE 22 shows the composition tested, the PAM 1% enrichment sequence, the effector protein SEQ ID NO (under Enzyme SEQ ID NO), and the sgRNA sequence, if applicable. The results in TABLE 22 show that the D2S enzyme had nuclease activity.

TABLE 22 Indels by a D2S effector
Comp. | PAM 1% | Indel percent | Enzyme SEQ ID NO: | sgRNA SEQ ID NO:
PL6015 | TNTG (SEQ ID NO: 368) | 0.385 | 228 | SEQ ID NO: 616

Example 15: PAM Screening for D2S Effector Proteins

D2S effector proteins and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from E. coli cells. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C., and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 23) generally had lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 23). In some cases, the 1% enrichment met the cutoff criteria, but the 5% enrichment did not. In such cases, a PAM is included for the 1% enrichment, but not the 5% enrichment. Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 23. TABLE 23 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO), the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. FIGS. 7A-7E illustrate PAM preferences for the different D2S effector proteins used in this example. As shown in TABLE 23, examination of the IVE assay revealed the presence of enriched 5′ PAM consensus sequences for the various D2S effector proteins.

TABLE 23 Compositions for D2S effector protein PAM screening
Comp. | Enzyme SEQ ID NO: | PAM 1% | PAM 5% | cr/sgRNA # (cr/sgRNA SEQ ID NO:) | cr/sgRNA | tracrRNA # (tracrRNA SEQ ID NO:)
PL5370, R6401, R6631 | 215 | NNNRTRG (SEQ ID NO: 343) | NNNRTRG (SEQ ID NO: 343) | R6401 (SEQ ID NO: 617) | crRNA | R6631 (SEQ ID NO: 618)
PL5370, R6401, R6630 | 215 | NNNRTRG (SEQ ID NO: 343) | NNNRTRG (SEQ ID NO: 343) | R6401 (SEQ ID NO: 617) | crRNA | R6630 (SEQ ID NO: 619)
PL5370, R6708 | 215 | NNNRTRG (SEQ ID NO: 343) | NNNRTRG (SEQ ID NO: 343) | R6708 (SEQ ID NO: 620) | sgRNA | N/A
PL5370, R6707 | 215 | NNNRTRG (SEQ ID NO: 343) | NNNRTRG (SEQ ID NO: 343) | R6707 (SEQ ID NO: 621) | sgRNA | N/A

Example 16: PAM Screening for D2S Effector Proteins

D2S effector proteins and guide RNA combinations were screened by in vitro enrichment (IVE) for PAM recognition. Effector proteins and guide RNAs were expressed and purified from E. coli. Briefly, effector proteins were complexed with corresponding guide RNAs for 15 minutes at 37° C. The complexes were added to an IVE reaction mix. PAM screening reactions used 10 μl of RNP in 100 μl reactions with 1,000 ng of a 5′ PAM library in 1× Cutsmart buffer and were carried out for 15 minutes at 25° C., 45 minutes at 37° C., and 15 minutes at 45° C. Reactions were terminated with 1 μl of proteinase K and 5 μl of 500 mM EDTA for 30 minutes at 37° C. Cis cleavage by each complex was confirmed by gel electrophoresis. Next generation sequencing was performed on cut sequences to confirm enriched PAMs. The top 5% enrichment (PAM 5% in TABLE 24) generally had lower signal, due to more noise, than the top 1% enrichment (PAM 1% in TABLE 24). Complexes (e.g., the composition) and corresponding identified PAMs are provided in TABLE 24. TABLE 24 also shows the effector protein SEQ ID NO (under Enzyme SEQ ID NO), the cr/sgRNA designation number, the tracrRNA designation number, and their corresponding sequences, if applicable. As shown in TABLE 24, the IVE assay revealed the presence of enriched 5′ PAM consensus sequences for the effector protein of SEQ ID NO: 23.

TABLE 24 Compositions for D2S effector protein PAM screening
Comp. | Enzyme SEQ ID NO: | PAM 1% | PAM 5% | cr/sgRNA # (cr/sgRNA SEQ ID NO:) | cr/sgRNA | tracrRNA # (tracrRNA SEQ ID NO:)
PL3296, R4856, R4893 | 23 | NNNNKCG (SEQ ID NO: 326) | NNNNKYG (SEQ ID NO: 327) | R4856 (SEQ ID NO: 68) | crRNA | R4893 (SEQ ID NO: 120)
PL3296, R4856, R4893 | 23 | NNNNKCG (SEQ ID NO: 326) | NNNNKYG (SEQ ID NO: 327) | R4856 (SEQ ID NO: 68) | crRNA | R4893 (SEQ ID NO: 120)
PL3296, R4856, R4893 | 23 | NNNNTCG (SEQ ID NO: 325) | NNNNTYG (SEQ ID NO: 328) | R4856 (SEQ ID NO: 68) | crRNA | R4893 (SEQ ID NO: 120)
PL3296, R4886 | 23 | NNNNTYG (SEQ ID NO: 328) | NNNNTYG (SEQ ID NO: 328) | R4886 (SEQ ID NO: 149) | sgRNA | N/A
PL3296, R4886 | 23 | NNNNTCG (SEQ ID NO: 325) | NNNNTYG (SEQ ID NO: 328) | R4886 (SEQ ID NO: 149) | sgRNA | N/A
PL3296, R4886 | 23 | NNNNTYG (SEQ ID NO: 328) | NNNNTYG (SEQ ID NO: 328) | R4886 (SEQ ID NO: 149) | sgRNA | N/A

Example 17: Guide RNA Optimization of Repeat Sequences

Guide RNAs were optimized for specific repeat sequences and designed to increase indel frequency. Repeat sequences were mutated and/or truncated for optimization. Guides with the optimized repeat sequence were tested in the indel experiments described herein for their ability to produce indels. Table 25 shows the different parts of the optimized guide RNA sequences (i.e., the tracrRNA sequence, the linker sequence, the repeat sequence, the spacer sequence, and the full sgRNA sequence).

TABLE 25 Optimized Guide Sequences Enzyme Seq ID Linker Repeat SpacerNO: TracrRNA Sequence sequence Sequence sequence Full sgRNA sequence 23UGGGGCAGUUGGUUGCCCUU GAAA UGGUAUA GUGCCUUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUU AGCCUGAGGCAUUUAUUGCA (SEQ ID UCCAACGUUUCUUC AUUGCACUCGGGAAGUACCAUUUCUCAGAAAUGG CUCGGGAAGUACCAUUUCUCNO: 623) (SEQ ID NO: AUCU (SEQ UAUAUCCAACGUGCCUUAGUUUCUUCAUCU(SEQA (SEQ ID NO: 622) 624) ID NO: 625) ID NO: 626) 23 UGGGGCAGUUGGUUGCCCUUUACAUCC UCUAGGCG UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUU AGCCUGAGGCAUUUAUUGCAAAC (SEQ CCCGCUAA AUUGCACUCGGGAAGUACCAUUACAUCCAACUCU CUCGGGAAGUACCAUID NO: 628) GUUC (SEQ AGGCGCCCGCUAAGUUC (SEQ ID NO: 517)(SEQ ID NO: 627) ID NO: 629) 23 UGGGGCAGUUGGUUGCCCUU GAAA UGGUACACGUGCUGU UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUU AGCCUGAGGCAUUUAUUGCA (SEQ IDUCCAAC UUCCUCCC AUUGCACUCGGGAAGUACCAUUUCUCAGAAAUGG CUCGGGAAGUACCAUUUCUCNO: 623) (SEQ ID NO: CACA (SEQ UACAUCCAACCGUGCUGUUUCCUCCCCACAA (SEQ ID NO: 622) 630) ID NO: 631) (SEQ ID NO: 632) 23AUGGGGCAGUUGGUUGCCCU GAAA AAC (SEQ CGUGCUGUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUU UAGCCUGAGGCAUUUAAUGC (SEQ IDID NO: 634) UUCCUCCC UAAUGCACUCGGGAGAAAAACCGUGCUGUUUCCU ACUCGGGA (SEQ IDNO: 623) CACG (SEQ CCCCACG (SEQ ID NO: 636) NO: 633) ID NO: 635) 23AUGGGGCAGUUGGUUGCCCU GAAA AUCCAAC CGUGCUGUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUU UAGCCUGAGGCAUUUAAUGC (SEQ ID(SEQ ID NO: UUCCUCCC UAAUGCACUCGGGAAGUACCGAAAAUCCAACCGUACUCGGGAAGUACC (SEQ NO: 623) 638) CACG (SEQGCUGUUUCCUCCCCACG (SEQ ID NO: 639) ID NO: 637) ID NO: 635) 23AUGGGGCAGUUGGUUGCCCU GAAA AGGUACA CGUGCUGUAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUU UAGCCUGAGGCAUUUAAUGC (SEQ ID UCCAACUUCCUCCC UAAUGCACUCGGGAAGUACCUUUUCUCAGAAAAG ACUCGGGAAGUACCUUUUCUNO: 623) (SEQ ID NO: CACG (SEQ GUACAUCCAACCGUGCUGUUUCCUCCCCACGCA (SEQ ID NO: 640) 641) ID NO: 635) (SEQ ID NO: 642) 23AUGGGGCAGUUGGUUGCCCU GAAA CCAAC UCUAGGCGAUGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUU UAGCCUGAGGCAUUUAAUGC (SEQ ID(SEQ ID NO: CCCGCUAA UAAUGCACUCGGGAAGUACCUUUUCUCAGAAACCACUCGGGAAGUACCUUUUCU NO: 623) 643) GUUC 
(SEQAACUCUAGGCGCCCGCUAAGUUC (SEQ ID CA (SEQ ID NO: 640) ID NO: 629) NO: 644)
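As a non-limiting sketch of how the parts listed in Table 25 relate to one another, a full sgRNA may be viewed as the 5′-to-3′ concatenation of the tracrRNA sequence, the linker sequence, the repeat sequence, and the spacer sequence, in the order those parts are listed above. The function name and the short sequences used below are hypothetical placeholders, not sequences from the Sequence Listing.

```python
RNA_ALPHABET = set("ACGU")


def assemble_sgrna(tracr: str, linker: str, repeat: str, spacer: str) -> str:
    """Concatenate sgRNA parts 5'->3' as tracrRNA + linker + repeat + spacer.

    Assumption: this ordering follows the column order of Table 25.
    Raises ValueError if any part contains a non-RNA character.
    """
    parts = {"tracr": tracr, "linker": linker, "repeat": repeat, "spacer": spacer}
    for name, seq in parts.items():
        bad = set(seq.upper()) - RNA_ALPHABET
        if bad:
            raise ValueError(f"{name} contains non-RNA characters: {sorted(bad)}")
    return "".join(seq.upper() for seq in parts.values())


# Placeholder parts (illustrative only).
full = assemble_sgrna("GGCA", "GAAA", "UCCAAC", "GUGC")
```

Truncating or mutating the repeat, as described in this example, changes only the `repeat` argument; the remaining parts and their order are unchanged in this sketch.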

Example 18: Activation of Gene Expression with CasLambda Fusion (CRISPRa)

Multiple gene targets, including NEUROD1, HBG1, ASCL1, and LIN28A, were selected for testing the ability of VPR-CasM fusions to increase endogenous gene expression. A nucleic acid vector encoding VPR (SEQ ID NO: 300) fused to the N terminus of a catalytically inactive CasM protein via an XTEN10 linker (GSPAGSPTST; SEQ ID NO: 711) and at least one CasM gRNA targeting an endogenous gene were introduced to cells via lipofection. Relative amounts of RNA, indicative of relative gene expression, were quantified with RT-qPCR. An increase in gene expression was observed with individual gRNAs. A scrambled sequence spacer (nt) and a pooled sample were used as negative controls. A catalytically inactive “dead” Cas9 fusion, dCas9, was included as a positive control. The different VPR-CasM fusion proteins were tested for their ability to increase expression of NEUROD1, HBG1, ASCL1, and LIN28A. FIG. 8A shows the change in gene expression by CasM.286251 (D267A) (SEQ ID NO: 728) with an N-terminal VPR fused by an XTEN10 linker, which demonstrated upregulation with guides 1-8 for ASCL1, HBG1, and LIN28A relative to the scrambled sequence control. FIG. 8B shows the change in gene expression by CasM.19952 (D267A) (SEQ ID NO: 729) with an N-terminal VPR fused by an XTEN10 linker, which demonstrated upregulation with guides 1-8 for ASCL1 and HBG1 and guide 3 for NEUROD1 relative to the scrambled sequence control. FIG. 8C shows the change in gene expression by CasM.19952 (D267N) (SEQ ID NO: 730) with an N-terminal VPR fused by an XTEN10 linker, which demonstrated upregulation with guides 1-8 for ASCL1 and guides 2-3 for NEUROD1 relative to the scrambled sequence control. FIG. 8D shows the change in gene expression by CasM.19952 (E363Q) (SEQ ID NO: 731) with an N-terminal VPR fused by an XTEN10 linker, which demonstrated upregulation with guides 1-8 for ASCL1 and guides 2-3 for NEUROD1 relative to the scrambled sequence control.
The PAM sequence for the CasM.19952 enzymes was NTCG (SEQ ID NO: 369), with guides comprising the repeat sequence: UGGGGCAGUUGGUUGCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCAUUUCUCAGAAAUGGUACAUCCAAC (SEQ ID NO: 645). The PAM sequence for the CasM.286251 enzymes was RTTR (SEQ ID NO: 370), with guides comprising the repeat sequence: AUGGGGCAGUUGGUUGCCCUUAGCCUGAGGAAUUUAAUUCACUCGGGAAGUACCUUUCUCAUGAAAUGGUACAUCCAAC (SEQ ID NO: 646). Table 26 denotes the spacer sequence for the designated guide IDs in FIGS. 8A-8D, the gene target, and the type of nuclease tested. The results show that catalytically inactive CasM proteins fused to VPR can increase the expression of genes.
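The disclosure states that relative RNA amounts were quantified with RT-qPCR but does not state how relative expression was computed. One widely used approach, shown here purely as an assumed illustration, is the comparative Ct (2^-ΔΔCt) method, in which the target gene is normalized to a reference gene and then compared between the treated sample and the control (e.g., the scrambled spacer). The function name and Ct values below are hypothetical.

```python
def fold_change_ddct(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes roughly 100% amplification efficiency for both amplicons.
    """
    d_ct_treated = ct_target_treated - ct_ref_treated  # normalize to reference
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control                # treated vs control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the treated sample, with an unchanged reference gene.
example = fold_change_ddct(24.0, 18.0, 27.0, 18.0)
```

In this invented example the target gene would be reported as upregulated 8-fold relative to the control.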

TABLE 26 Guide sequences for Activation of Gene Expression
ID in FIGS. 8A-8D | Spacer sequence | Gene target | Nuclease
g1 | CCCCCCACUCCCCGCUGCUG (SEQ ID NO: 647) | ASCL1 | CasM.19952
g2 | AAGUGGCAUCCUCUCUGAGC (SEQ ID NO: 648) | ASCL1 | CasM.19952
g3 | CUUCCUCGUCUGCAGCCACA (SEQ ID NO: 649) | ASCL1 | CasM.19952
g4 | ACUUUUCCUGUUUUCUCUCU (SEQ ID NO: 650) | ASCL1 | CasM.19952
g5 | GGUUCCUCGGUGACCCUAGA (SEQ ID NO: 651) | ASCL1 | CasM.19952
g6 | GUGACCCUAGAAAUUGGAGC (SEQ ID NO: 652) | ASCL1 | CasM.19952
g7 | UCUGCAGCCACAGAAUAUGG (SEQ ID NO: 653) | ASCL1 | CasM.19952
g8 | AGGAGCCACAGAGCAUUGAG (SEQ ID NO: 654) | ASCL1 | CasM.19952
g1 | GAGGAGGGCGGGAGACGAGC (SEQ ID NO: 655) | NEUROD1 | CasM.19952
g2 | UCUCCCGCCCUCCUCCGACA (SEQ ID NO: 656) | NEUROD1 | CasM.19952
g3 | CCAGUUAGAGACUCCGCGGA (SEQ ID NO: 657) | NEUROD1 | CasM.19952
g4 | CUCUGAUCUAGACCUAGUUA (SEQ ID NO: 658) | NEUROD1 | CasM.19952
g5 | CGCCGGAAGUAGGACAGAGG (SEQ ID NO: 659) | NEUROD1 | CasM.19952
g6 | AAAGGAGCGAGGACUCUUCA (SEQ ID NO: 660) | NEUROD1 | CasM.19952
g7 | CUCCUUUCGAUUUCUUGUCC (SEQ ID NO: 661) | NEUROD1 | CasM.19952
g8 | AUUUCUUGUCCUGACACUGG (SEQ ID NO: 662) | NEUROD1 | CasM.19952
g1 | GAACAAGGCAAAGGCUAUAA (SEQ ID NO: 663) | HBG1 | CasM.19952
g2 | AGUUAUAAUAGUGUGUGGAC (SEQ ID NO: 664) | HBG1 | CasM.19952
g3 | AAUAUUAGUGUACUUUAGAC (SEQ ID NO: 665) | HBG1 | CasM.19952
g4 | UUGAGCCCCUUCCUCGCUGC (SEQ ID NO: 666) | HBG1 | CasM.19952
g5 | AAGGUACAUGUGCAGGAUGU (SEQ ID NO: 667) | HBG1 | CasM.19952
g6 | GCAACCAGUAGCCCUUGCGU (SEQ ID NO: 668) | HBG1 | CasM.19952
g7 | CACUUUCUUUCUUUGUCCUU (SEQ ID NO: 669) | HBG1 | CasM.19952
g8 | GUGUUCAGUGGAUUAGAAAC (SEQ ID NO: 670) | HBG1 | CasM.19952
g1 | GAGAAGAAGCUGCUACAUCU (SEQ ID NO: 671) | LIN28A | CasM.19952
g2 | UUAACAAAUAUUAUUAGCAG (SEQ ID NO: 672) | LIN28A | CasM.19952
g3 | UCCUACCCCCACCCCAUCCC (SEQ ID NO: 673) | LIN28A | CasM.19952
g4 | GAGAUGGACAAUGGCCCGGG (SEQ ID NO: 674) | LIN28A | CasM.19952
g5 | CUCCGUGUACCUCUGUUCCU (SEQ ID NO: 675) | LIN28A | CasM.19952
g6 | GUGGAGAAGAUUGAAUUCAG (SEQ ID NO: 676) | LIN28A | CasM.19952
g7 | UACGGGGUGCUCUCCAAGAA (SEQ ID NO: 677) | LIN28A | CasM.19952
g8 | UGGGGUAAAAAGGACAAGAG (SEQ ID NO: 678) | LIN28A | CasM.19952
g1 | AAAAGGCGGACGCACUCCGG (SEQ ID NO: 679) | ASCL1 | CasM.286251
g2 | GGGGAGGGACUCCGUCCAGA (SEQ ID NO: 680) | ASCL1 | CasM.286251
g3 | GAGACCAUAUUCUGUGGCUG (SEQ ID NO: 681) | ASCL1 | CasM.286251
g4 | AGGUGUAUAGGUGGAAAGAC (SEQ ID NO: 682) | ASCL1 | CasM.286251
g5 | UUCUCUUCGGGUUCCUCGGU (SEQ ID NO: 683) | ASCL1 | CasM.286251
g6 | GAGCAAAUUACGAUUGAAGU (SEQ ID NO: 684) | ASCL1 | CasM.286251
g7 | CGAUUGAAGUUUAGAAACAU (SEQ ID NO: 685) | ASCL1 | CasM.286251
g8 | AAGUUUAGAAACAUGGUUGG (SEQ ID NO: 686) | ASCL1 | CasM.286251
g1 | UCGGAGGAGGGCGGGAGACG (SEQ ID NO: 687) | NEUROD1 | CasM.286251
g2 | AUCUCUCCUGCGGGUAAAAA (SEQ ID NO: 688) | NEUROD1 | CasM.286251
g3 | GCUUUUCCCUUCCUUCCCUC (SEQ ID NO: 689) | NEUROD1 | CasM.286251
g4 | ACAUUAGCUUUUCCCUUCCU (SEQ ID NO: 690) | NEUROD1 | CasM.286251
g5 | ACUAGGUCUAGAUCAGAGCG (SEQ ID NO: 691) | NEUROD1 | CasM.286251
g6 | GCGCCAAAGGAUGGCUUCUC (SEQ ID NO: 692) | NEUROD1 | CasM.286251
g7 | GGAGAAGCCAUCCUUUGGCG (SEQ ID NO: 693) | NEUROD1 | CasM.286251
g8 | GGGAACUAAUCUCAACGCUG (SEQ ID NO: 694) | NEUROD1 | CasM.286251
g1 | GUCAAGUUUGCCUUGUCAAG (SEQ ID NO: 695) | HBG1 | CasM.286251
g2 | GCCAGCCUUGCCUUGACCAA (SEQ ID NO: 696) | HBG1 | CasM.286251
g3 | GUCAAGGCAAGGCUGGCCAA (SEQ ID NO: 697) | HBG1 | CasM.286251
g4 | AGAUAGUGUGGGGAAGGGGC (SEQ ID NO: 698) | HBG1 | CasM.286251
g5 | GCAGUGGUUUCUAAGGAAAA (SEQ ID NO: 699) | HBG1 | CasM.286251
g6 | GAGAAAAACUGGAAUGACUG (SEQ ID NO: 700) | HBG1 | CasM.286251
g7 | GUACAUGCUUUAGCUUUAAA (SEQ ID NO: 701) | HBG1 | CasM.286251
g8 | AGAGAUAAUGGCAAAAGUCA (SEQ ID NO: 702) | HBG1 | CasM.286251
g1 | GUUCGGAGAAGAAGCUGCUA (SEQ ID NO: 703) | LIN28A | CasM.286251
g2 | UGCGGGGGAAGAUGUAGCAG (SEQ ID NO: 704) | LIN28A | CasM.286251
g3 | UCUUUUAGAAUUUGGGAGCC (SEQ ID NO: 705) | LIN28A | CasM.286251
g4 | GGUCAUUGUCUUUUAGAAUU (SEQ ID NO: 706) | LIN28A | CasM.286251
g5 | UGGGGGAGGGCCGGAGCUGG (SEQ ID NO: 707) | LIN28A | CasM.286251
g6 | UGCGUGUGGGGAGGGGGUGU (SEQ ID NO: 708) | LIN28A | CasM.286251
g7 | GGGGAGGGAGGUGUGAGCCU (SEQ ID NO: 709) | LIN28A | CasM.286251
g8 | GCCAGCGCCGCCAGGCUCAC (SEQ ID NO: 710) | LIN28A | CasM.286251

Example 19: Base Editing with Dead CasM.19952 Variant-Deaminase Fusion Proteins

Multiple nucleic acid vectors encoding fusion proteins of the catalytically inactive variant dCasM.19952 (D267A) (SEQ ID NO: 729) were constructed as shown in FIG. 9 and assessed for base editing activity. These fusion proteins comprised the catalytically inactive variant dCasM.19952 (D267A) (SEQ ID NO: 729), also referred to as “dead CasM”, of the active CasM.19952 (SEQ ID NO: 23), fused to either ABE8e (SEQ ID NO: 713), ABE8.20m (SEQ ID NO: 714), APOBEC3 (SEQ ID NO: 732), or AncBE4Max (SEQ ID NO: 733) via an XTEN10 linker (GSPAGSPTST; SEQ ID NO: 711), an XTEN40 linker (GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPA; SEQ ID NO: 734), or an XTEN80 linker (GSPAGSPTSTEEGTSESATPESGPGTSTEPSEGSAPGSPAGSPTSTEEGTSTEPSEGSAPGTSTEPSEGSAPGTSESATP; SEQ ID NO: 735). The base editing effector sequences can be found in Table 27. These vectors also encoded an amino acid sequence containing a nuclear localization signal (PKKKRKV; SEQ ID NO: 712) fused to the dead CasM.19952. Guides with no effectors served as negative controls lacking deaminase or base editing function. Target sequences included sequences located in the genes for B2M, TRAC, CIITA, or NGCG_B2M. Guide RNA spacer sequences and their respective targets are provided in Table 28. Cells were transfected with the nucleic acid vectors and guide RNAs. After sufficient incubation, DNA was extracted from the transfected cells. Target sequences were PCR amplified and sequenced by NGS and MiSeq. The presence of base modifications was analyzed from sequencing data after subtraction of background editing (using the no deaminase control). FIG. 9 shows the indel percentage of (catalytically active) CasM.19952 and gRNAs at respective target sites.

Designs with observed base editing are shown in Table 29. Editing was observed in the CIITA_26, CIITA_1, and TRAC_5 targets. Little to no editing was observed in the B2M_5, CIITA_12, CIITA_19, CIITA_6, TRAC_1, TRAC_3, CIITA_15, NGCG_B2M_3, CIITA_9, and CIITA_20 targets. The rows in Table 29 show distinct fusion protein designs (for example, APOBEC3 (base editor) fused via a C-XTEN80 linker to dCasM.19952). The columns represent distinct guide RNA spacer sequences from Table 28. The bases where editing was observed are represented as the position within the spacer and shown under the guide RNAs. The bases in parentheses indicate bases where editing was not observed. These bases are either the next closest base to the observed edited bases or any bases near the putative editing window. The prefix + indicates the number of positions after the spacer sequence. FIG. 10A and FIG. 10B show the change in base call percentage along the spacer sequence for the CIITA t26 target. The spacer sequence is shown on the upper X-axis and the change in base call is shown on the Y-axis. FIG. 10A shows an about 1% base change at position A9 to a G base with the construct ABE8e-XTEN10-dCasM.19952(D267A). FIG. 10B shows an about 0.70%-0.75% base change at positions C6 and C8 to a T base with the construct AncBE4Max-XTEN10-dCasM.19952(D267A). The results show that dCasM.19952 can be fused with a base editing enzyme to generate base edits in a sequence.

TABLE 27 Fusion effector sequences SEQ ID Name NO: Sequence ABE8e 713SEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGE (base editor)GWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSIN ABE8.20m 714SEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGE (base editor)GWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLYDATLYSTFEPCVMCAGAMIHSRIGRVVFGVRNAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLCRFFRMPRRVFNAQKKAQSSTD APOBEC3 732EASPASGPRHLMDPHIFTSNFNNGIGRHKTYLCYEVERLDNGTS (base editor)VKMDQHRGFLHNQAKNLLCGFYGRHAELRFLDLVPSLQLDPAQIYRVTWFISWSPCFSWGCAGEVRAFLQENTHVRLRIFAARIYDYDPLYKEALQMLRDAGAQVSIMTYDEFKHCWDTFVDHQGCP FQPWDGLDEHSQALSGRLRAILQNQGNAncBE4Max 733 SSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEIKWGT(base editor) SHKIWRHSSKNTTKHVEVNFIEKFTSERHFCPSTSCSITWFLSWSPCGECSKAITEFLSQHPNVTLVIYVARLYHHMDQQNRQGLRDLVNSGVTIQIMTAPEYDYCWRNFVNYPPGKEAHWPRYPPLWMKLYALELHAGILGLPPCLNILRRKQPQLTFFTIALQSCHYQRLPPH ILWATGLK

TABLE 28 Guide Spacer Sequences for Base Editing
Target       Spacer
B2M_5        CUCCGUGGCCUUAGCUGUGC (SEQ ID NO: 715)
CIITA_1      GUGAGGAAGCACCUGAGCCC (SEQ ID NO: 716)
CIITA_12     CUGCAUCCCUGCUCAGGCUA (SEQ ID NO: 717)
CIITA_19     UCCUGGAGAGAACAGGCAAU (SEQ ID NO: 718)
CIITA_26     CAGCUCUCAGCCACCUUCCC (SEQ ID NO: 719)
CIITA_6      GGACCUAAAGAAACUGGAGU (SEQ ID NO: 720)
TRAC_1       ACCAGCUUGACAUCACAGGA (SEQ ID NO: 721)
TRAC_3       GAACCCAAUCACUGACAGGU (SEQ ID NO: 722)
TRAC_5       GUGAAUAGGCAGACAGACUU (SEQ ID NO: 723)
CIITA_15     CAGAUGCAGUUAUUGUACAA (SEQ ID NO: 724)
NGCG_B2M_3   CGAGCACAGCUAAGGCCACG (SEQ ID NO: 725)
CIITA_9      CUCCAUCAGCCACUGACCUG (SEQ ID NO: 726)
CIITA_20     GGGACGAGGGUGUCUCGCAG (SEQ ID NO: 727)

TABLE 29 Constructs with observed base editing in target sequences
Construct          CIITA_26            CIITA_1        TRAC_5
APOBEC3 C-XTEN80   C6 (C4, C8)
BE4Max N-XTEN10    C6, C8 (C4, C11)
ABE8e N-XTEN10     A9 (A2, A13)        A8 (A7, A11)   A7 (A5, A11)
BE4Max N-XTEN40    C6 (C4, C8)
ABE8e N-XTEN40     A9 (A2, A13)
APOBEC3 N-XTEN80   C6 (C4, C8)
ABE8e N-XTEN80     A9, A13 (A2, A+3)

Example 20: CasM.19952 Sequence Homology

It is well known that sequence diversity is a characteristic of CRISPR/Cas systems and that effector proteins can exhibit low levels of sequence identity yet belong to the same class, type or subtype of CRISPR effector protein. To assess sequence diversity between the D2S effector proteins, the sequences of the effector proteins were aligned using pairwise MUSCLE alignment. Each aligned sequence was compared to the CasM.19952 (SEQ ID NO: 23) aligned sequence. As shown in Table 30, 19 of the D2S effector proteins are at least 75% identical to CasM.19952.

TABLE 30 Sequence alignment of D2S effector proteins
SEQ ID NO   Effector protein name   Identity to CasM.19952 (%)
23          CasM.19952              100.00
26          CasM.288480             96.15
24          CasM.274559             94.66
208         CasM.272451             92.31
222         CasM.289248             84.19
28          CasM.289206             84.19
29          CasM.290598             83.97
217         CasM.287826             83.76
229         CasM.294406             82.09
25          CasM.286251             81.88
30          CasM.290816             81.74
219         CasM.287936             81.66
207         CasM.270012             81.41
32          CasM.295231             81.10
202         CasM.19498              79.96
220         CasM.288450             79.32
205         CasM.19948              78.63
34          CasM.279423             78.42
31          CasM.295071             78.21
27          CasM.288668             75.43
213         CasM.285333             61.65
225         CasM.290380             60.91
216         CasM.287128             60.63
215         CasM.286678             59.47
22          CasM.19924              53.73

Example 21: D2S Effector Protein Motif Analysis

The MEME algorithm (Multiple EM for Motif Elicitation, Bailey & Elkan, 1994) was used to identify sequence motifs that are shared by D2S effector proteins (SEQ ID NOs: 1-45 and 202-240). The analysis was performed using the default parameters. This analysis identified the seven highly conserved motifs that are shown in FIG. 11A. The number of analyzed sequences that include each motif is provided in Table 31 along with the length of each motif.

TABLE 31 D2S motif analysis
Motif ID   Number of sequences that include the motif (out of 84)   Motif length
MEME_1     79                                                       50
MEME_2     81                                                       29
MEME_3     80                                                       21
MEME_4     30                                                       41
MEME_5     77                                                       21
MEME_6     76                                                       15
MEME_7     82                                                       23

The weblogos in FIG. 11A provide multilevel consensus sequences. Weblogos corresponding to MEME_1, MEME_2, MEME_3, MEME_4, MEME_5, MEME_6 and MEME_7 are shown in FIG. 11A. This multilevel sequence analysis of the weblogos in FIG. 11A was used to generate the PROSITE motifs shown in Table 32. In Table 32, the brackets indicate amino acids in the alternative; for example, [KG] means K or G. In another example, [VFL] means V, F, or L. PROSITE motifs are routinely used in the art to conveniently illustrate consensus motifs.

TABLE 32 D2S PROSITE motifs
Motif ID   SEQ ID NO   PROSITE motif
MEME_1     793         [KG][ET]F[VFL][LG][RK]NW[SRT]Y[YF][EDQ]LQ[NT][MK]I[EK]YKA[KA]E[YA]GIKV[VE][KY][IV][NR]P[AK]YTS[QRK][RT]CS[WK]CG[YQH]I[GD][KF][RD][NF]
MEME_2     794         T[QL]NH[LRQ][YF]SR[EA][VL][IV][DEN][FY]AVK[NH]GA[GA]TI[QH]ME[DN]LSG
MEME_3     795         L[ND][KP][NKE][IK][VI][VL]GVDLG[IV][NS][VY]P[LA]Y[AV][AS][TV]
MEME_4     796         QW[GN]LLYHINDNLY[KR]AANNISSKLYLD[DE]HVSSMVR[LM]KH[AD]EYL
MEME_5     797         V[LK]RG[EK]R[SA][IL][PR][NTS][YF][KR][KS][GDN][MQ]P[IL]P[FI][HP][WC]
MEME_6     798         [NH]ADYNA[AS][RQ]N[IL][AS][IN][SK][KD][ID]
MEME_7     799         [RY][LC][GK][GT][TG]R[GI]G[HK]GRK[KR][KR]LEP[LI][EY][RK]L[RE][DG]

The location of the detected motifs in the effector proteins is illustrated in FIG. 11B. All motifs illustrated in FIG. 11B shared at least 36.5% identity to the PROSITE sequences shown in Table 32. In general, MEME_4 and MEME_5 are located in the N terminal half of the effector protein. In general, MEME_1, MEME_2, MEME_6, and MEME_7 are located in the C terminal half of the effector protein. In general, the order of MEMEs from N terminus to C terminus is: MEME_4, MEME_5, MEME_3, MEME_7, MEME_2, MEME_1, MEME_6.

In general, the motifs demonstrate a similar distribution in all D2S effector domains shown in FIG. 11B, namely MEME_4, MEME_5, MEME_3, MEME_7, MEME_2, MEME_1 and MEME_6 (from N- to C-terminus). All seven motifs were identified in many of the effector proteins shown in FIG. 11B. However, not all seven motifs are identified in every effector protein. For example, in some instances MEME_4 was not identified, but the effector protein includes MEME_5, MEME_3, MEME_7, MEME_2, MEME_1 and MEME_6 (from N- to C-terminus), e.g., CasM.298706.

The degree of identity of PROSITE motifs MEME_1 to MEME_7 in the D2S effector proteins that share greater than 75% identity with CasM.19952 was calculated. In calculating these degrees of identity, each alternative in a PROSITE motif was given an equal weight. For example, both NAD and HAD share 100% identity with the PROSITE motif [NH]AD. The output from this identity analysis is shown in Table 33.

TABLE 33 Conservation of the D2S motifs (percent identity)
Effector Protein   MEME_1   MEME_2        MEME_3        MEME_4        MEME_5        MEME_6        MEME_7
CasM.19498         90       82.75862069   100           100           95.23809524   100           91.30434783
CasM.19948         92       86.20689655   95.23809524   92.68292683   100           100           86.95652174
CasM.19952         92       93.10344828   95.23809524   100           95.23809524   100           82.60869565
CasM.270012        88       82.75862069   95.23809524   100           100           100           86.95652174
CasM.272451        88       93.10344828   95.23809524   100           100           93.33333333   86.95652174
CasM.274559        94       93.10344828   90.47619048   100           95.23809524   93.33333333   86.95652174
CasM.279423        88       82.75862069   100           95.12195122   100           93.33333333   82.60869565
CasM.286251        94       93.10344828   95.23809524   97.56097561   100           93.33333333   82.60869565
CasM.287826        90       89.65517241   100           97.56097561   95.23809524   100           86.95652174
CasM.287936        94       93.10344828   95.23809524   97.56097561   100           93.33333333   78.26086957
CasM.288450        92       72.4137931    95.23809524   90.24390244   95.23809524   86.66666667   91.30434783
CasM.288480        92       93.10344828   100           97.56097561   95.23809524   100           86.95652174
CasM.288668        94       86.20689655   95.23809524   97.56097561   100           93.33333333   69.56521739
CasM.289206        88       89.65517241   100           100           100           93.33333333   91.30434783
CasM.289248        88       86.20689655   100           100           100           93.33333333   91.30434783
CasM.290598        90       89.65517241   95.23809524   95.12195122   100           100           82.60869565
CasM.290816        94       79.31034483   95.23809524   92.68292683   95.23809524   100           82.60869565
CasM.294406        92       93.10344828   95.23809524   97.56097561   95.23809524   86.66666667   82.60869565
CasM.295071        94       93.10344828   95.23809524   97.56097561   100           93.33333333   82.60869565
CasM.295231        92       79.31034483   95.23809524   92.68292683   95.23809524   100           86.95652174
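
The equal-weight identity calculation described before Table 33 can be illustrated with a minimal sketch. It assumes the candidate sequence has already been extracted and has the same length as the motif; the function names parse_prosite and percent_identity are hypothetical and are not part of the disclosed method.

```python
import re


def parse_prosite(motif: str) -> list[set[str]]:
    """Split a bracketed motif such as '[NH]AD' into one set of
    allowed residues per position."""
    tokens = re.findall(r"\[[A-Z]+\]|[A-Z]", motif)
    return [set(tok.strip("[]")) for tok in tokens]


def percent_identity(motif: str, sequence: str) -> float:
    """Percent of positions where the residue matches any alternative
    listed at that position (all alternatives weighted equally)."""
    positions = parse_prosite(motif)
    if len(positions) != len(sequence):
        raise ValueError("sequence must be the same length as the motif")
    matches = sum(res in allowed for res, allowed in zip(sequence, positions))
    return 100.0 * matches / len(positions)


# Example from the text: NAD and HAD both fully match [NH]AD.
print(percent_identity("[NH]AD", "NAD"))
print(percent_identity("[NH]AD", "HAD"))
```

Because every alternative counts as a full match, a single mismatch in a 15-position motif such as MEME_6 lowers the value by 1/15, which is consistent with the 93.33 entries in Table 33.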

Table 33 shows that motifs MEME_1 to MEME_7 are highly conserved between D2S effector proteins that are at least 75% identical to CasM.19952. In particular, all effector proteins described in Table 33 comprise an amino acid sequence that is at least 69.5% identical to each of MEME_1 to MEME_7. All effector proteins described in Table 33 comprise an amino acid sequence that is at least 72% identical to each of MEME_1 to MEME_6. All effector proteins described in Table 33 comprise an amino acid sequence that is at least 90% identical to each of MEME_1, and MEME_3 to MEME_6.

MEME_4 was found to be a particularly useful motif for identifying the group of D2S effector proteins and distinguishing these D2S effector proteins from previously known effector proteins. All effector proteins described in Table 33 comprise an amino acid sequence that is at least 90% identical to MEME_4. In some cases, the D2S effector proteins include an amino acid sequence that is at least 37% identical to MEME_4.

Example 22: D2S Enzymes Edit Genomic DNA in Mammalian Cells

D2S effectors were tested for their ability to produce indels in HEK293T cells. Briefly, 300 ng of plasmids expressing effector and gRNA were delivered by lipofection to HEK293T cells in 96-well plates using TransIT-293 reagent at a ratio of 2:1 lipid:DNA. Cells were incubated for 3 days before being lysed and subjected to PCR amplification. Indels were detected by next generation sequencing of PCR amplicons at the targeted loci, and indel percentage was calculated as the fraction of sequencing reads containing insertions or deletions relative to an unedited reference sequence. Sequencing libraries with less than 20% of reads aligning to the reference sequence were excluded from the analysis for quality control purposes. “No plasmid” and CasM.19952 (SEQ ID NO: 23) were included as negative and positive controls, respectively. TABLE 34 shows the results of this experiment. TABLE 34 lists the sgRNA sequences tested, with and without spacer, and the percent of reads with indels. Additionally, TABLE 34 shows the composition tested and the effector protein SEQ ID NO (under Enzyme SEQ ID NO). The results in TABLE 34 show that these D2S enzymes are capable of modifying a genome in mammalian cells. Collectively, these guides targeted PAM sequences as described in TABLE 35.
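
The indel percentage and quality-control filter described above can be sketched as follows. This is an illustration only: the read counts are invented, the function names are hypothetical, and the choice of aligned reads as the denominator is an assumption, since the text does not state which read total was used.

```python
def passes_qc(aligned_reads: int, total_reads: int,
              min_aligned_fraction: float = 0.20) -> bool:
    """Keep a library only if at least 20% of its reads align to the reference."""
    if total_reads <= 0:
        return False
    return aligned_reads / total_reads >= min_aligned_fraction


def indel_percent(reads_with_indel: int, aligned_reads: int) -> float:
    """Percent of aligned reads carrying an insertion or deletion
    relative to the unedited reference (denominator is an assumption)."""
    if aligned_reads <= 0:
        raise ValueError("no aligned reads")
    return 100.0 * reads_with_indel / aligned_reads


# Hypothetical library: 10,000 reads, 8,000 aligned, 740 with an indel.
if passes_qc(aligned_reads=8000, total_reads=10000):
    print(indel_percent(reads_with_indel=740, aligned_reads=8000))
```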

TABLE 34 Results of Indel experiment with D2S effectors Enzyme SEQ IndelComp. ID NO Percent sgRNA sequence with spacersgRNA sequence without spacer PL8080 220 1.506AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGC AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAA CUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUGUACCUUAUUUCAUUGAGCAACAGAAAGGGUA ACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUCAUCCAACUCUAGGCGCCCGCUAAGUUC (SEQ CCAAC (SEQ ID NO: 737) ID NO: 736)PL8082 220 1.273 AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCAAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCC CCUUAGCCUGAGGCAUUUAUUGCACUCGGGAACUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGU GUACCUUAUUUCAUUGAGCAACAGAAAGGGUAACCUUAUUUCAUUGAGCAACAGAAAGGGUACAU CAUCCAACGGACAAAGUUUAGGGCGUCG (SEQCCAAC (SEQ ID NO: 737) ID NO 738) PL8083 220 1.287AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGC AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAA CUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUGUACCUUAUUUCAUUGAGCAACAGAAAGGGUA ACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUCAUCCAACAUAAGCGUCAGAGCGCCGAG (SEQ CCAAC (SEQ ID NO: 737) ID NO 739)PL8086 220 0.861 AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCAAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCC CCUUAGCCUGAGGCAUUUAUUGCACUCGGGAACUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGU GUACCUUAUUUCAUUGAGCAACAGAAAGGGUAACCUUAUUUCAUUGAGCAACAGAAAGGGUACAU CAUCCAACCUCCGUGGCCUUAGCUGUGC (SEQCCAAC (SEQ ID NO: 737) ID NO 740) PL8087 220 9.254AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGC AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAA CUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUGUACCUUAUUUCAUUGAGCAACAGAAAGGGUA ACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUCAUCCAACGAUGGAUGAAACCCAGACAC (SEQ CCAAC (SEQ ID NO: 737) ID NO 741)PL8090 220 3.132 AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCAAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCC CCUUAGCCUGAGGCAUUUAUUGCACUCGGGAACUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGU GUACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUCCAACUGAUGAUUCUGCCCUCCUCC (SEQ ACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUID NO 742) CCAAC (SEQ ID NO: 737) PL8091 220 9.643AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGC AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCCCCUUAGCCUGAGGCAUUUAUUGCACUCGGGAA 
CUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUGUACCUUAUUUCAUUGAGCAACAGAAAGGGUA ACCUUAUUUCAUUGAGCAACAGAAAGGGUACAUCAUCCAACAGUACAUCUUCAAGCCAUCC (SEQ CCAAC (SEQ ID NO: 737) ID NO 743)PL8097 220 0.679 AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCAAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGCC CCUUAGCCUGAGGCAUUUAUUGCACUCGGGAACUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGU GUACCUUAUUUCAUUGAGCAACAGAAAGGGUAACCUUAUUUCAUUGAGCAACAGAAAGGGUACAU CAUCCAACGACCUAAGGGAGAGCCAGGA (SEQCCAAC (SEQ ID NO: 737) ID NO 744) PL8100 220 1.225AAUAGGAUUUAUCCUAUGGGGCAGUUGGUUGC CCUUAGCCUGAGGCAUUUAUUGCACUCGGGAAGUACCUUAUUUCAUUGAGCAACAGAAAGGGUA CAUCCAACGGAAGAUUCUGAUGUGGAAA(SEQID NO 745) PL8133 233 0.531 GGGGCAGUUGGAUGCCCUUAUGCUGAGGGAUUGGGGCAGUUGGAUGCCCUUAUGCUGAGGGAUUA AUUCCACUCGGCAAGUACCAAUAAUAAUGGAUUUCCACUCGGCAAGUACCAAUAAUAAUGGAUGU GUGAAAAGGUACAUCCAACUGAGUGGGGCAGUGAAAAGGUACAUCCAAC (SEQ ID NO 747) GGGGGCG (SEQ ID NO 746) PL8150 23311.948 GGGGCAGUUGGAUGCCCUUAUGCUGAGGGAUUGGGGCAGUUGGAUGCCCUUAUGCUGAGGGAUUA AUUCCACUCGGCAAGUACCAAUAAUAAUGGAUUUCCACUCGGCAAGUACCAAUAAUAAUGGAUGU GUGAAAAGGUACAUCCAACUCGGGGGGCGGGGGAAAAGGUACAUCCAAC (SEQ ID NO 747) GGGAGAA (SEQ ID NO 748) PL8178 2400.553 CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACCUCACGUCAU AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)CCAGCAGAGA (SEQ ID NO: 749) PL8180 240 4.621CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACUUGUGCUGUA AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)GGAAGCUCAU (SEQ ID NO: 751) PL8185 240 3.863CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACAUGAGAGCAA AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)GUGGGCUGAU (SEQ ID NO: 752) PL8186 240 2.340CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU 
UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACAGGUGGCAGC AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)GGCUUGAUCC (SEQ ID NO: 753) PL8187 240 3.144CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACGCCAAAGGCA AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)UGUGAGGUAC (SEQ ID NO: 754) PL8192 240 6.771CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACGGGCAGCUGG AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)UGGAAUUUUU (SEQ ID NO: 755 ) PL8194 240 12.361CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACCAGGUUGAGA AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)ACUUGUUGCU (SEQ ID NO: 756) PL8195 240 4.499CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACUCCCGACCCU AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)CCCGUCGCCG (SEQ ID NO: 757) PL8197 240 8.178CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACGGACGAGCCU AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)ACCCGUCCCC (SEQ ID NO: 758) PL8198 240 1.089CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUU CGGGUGGUUGCACAUCCGAAGGGUGAGGAUUUAAUUCACUCACUAAUACUACAAAUGGAAAAAUU UUCACUCACUAAUACUACAAAUGGAAAAAUUUAUAAAGGAAAAUGUAAAUGCAACUCGGGGGGCG AAGGAAAAUGUAAAUGCAAC (SEQ ID NO: 750)GGGGGGAGAA (SEQ ID NO: 759) PL8216 16 0.941UGAAAUAUUGAUUGAGGUCGCCGUUUACGUUG UGAAAUAUUGAUUGAGGUCGCCGUUUACGUUGCCGUCACAAGGGCGCGCGGGCGACCGAAGGCCG GUCACAAGGGCGCGCGGGCGACCGAAGGCCGAUCAUCUGUACGGCCUGCAGGUUGAGAAGGCACAU UGUACGGCCUGCAGGUUGAGAAGGCACAUAUUAAUUAGAGGAAAAUUGCUUCCCUUUGUGUUCGC GAGGAAAAUUGCUUCCCUUUGUGUUCGCUCACCUCACCGAGUAUUCCUUGUUAUUUGCGGCAAGA GAGUAUUCCUUGUUAUUUGCGGCAAGAAACUGUAACUGUCUUAAUUGUUUGAAAGGGUGCAUACA 
CUUAAUUGUUUGAAAGGGUGCAUACAGG (SEQGGACCUCAAAUUCCUCCUCAGA (SEQ ID NO: ID NO: 761) 760) PL8240 14 0.620AAGCAACCGCGUACACGCGGACGAACGGCCGA AAGCAACCGCGUACACGCGGACGAACGGCCGACCCCUGCUCGGCCUGAAGGUUGAGAAGGUUAUGU UGCUCGGCCUGAAGGUUGAGAAGGUUAUGUAUAAUAAGAGGAGAAAAUCCCCCUUCAUAAUCGCU AGAGGAGAAAAUCCCCCUUCAUAAUCGCUCACCACACCAAGCUCCCAAUUUACAUAUUUUGAAAGG AGCUCCCAAUUUACAUAUUUUGAAAGGGCGCAUGCGCAUGCAGGACCUCAAAUUCCUCCUCAGA GCAGG (SEQ ID NO: 763) (SEQ ID NO: 762)PL8252 15 0.693 UAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAUAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAG GGCAACUGAAGGCCGACCUGUACGGCCUUAAGGCAACUGAAGGCCGACCUGUACGGCCUUAAGGU GUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUUC UUCCCGUUGUGUUCGCUCACCAAGCACACACGCCGUUGUGUUCGCUCACCAAGCACACACGUUUGA UUUGAAAUGUGGGGUGCUUACAGGAUCCAACAAAUGUGGGGUGCUUACAGG (SEQ ID NO: 765) GCCAGGGGGACU (SEQ ID NO: 764)PL8253 15 1.435 UAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAUAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAG GGCAACUGAAGGCCGACCUGUACGGCCUUAAGGCAACUGAAGGCCGACCUGUACGGCCUUAAGGU GUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUUC UUCCCGUUGUGUUCGCUCACCAAGCACACACGCCGUUGUGUUCGCUCACCAAGCACACACGUUUGA UUUGAAAUGUGGGGUGCUUACAGGAUCCUGUGAAUGUGGGGUGCUUACAGG (SEQ ID NO: 765) UGCCCCUGAUGC (SEQ ID NO: 766)PL8264 15 0.543 UAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAUAUUGCGCUAGCCAUAAUGGCAAUCGCGUACAG GGCAACUGAAGGCCGACCUGUACGGCCUUAAGGCAACUGAAGGCCGACCUGUACGGCCUUAAGGU GUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUGAGAAGGCACAUGUAAGUGGAAAAAUGCUUUC UUCCCGUUGUGUUCGCUCACCAAGCACACACGCCGUUGUGUUCGCUCACCAAGCACACACGUUUGA UUUGAAAUGUGGGGUGCUUACAGGACCUCAAAAAUGUGGGGUGCUUACAGG (SEQ ID NO: 765) UUCCUCCUCAGA (SEQ ID NO: 767)PL8272 239 3.642 AGUAUGAGGCCGCCGAUAAACGUUUCGCUAGCAGUAUGAGGCCGCCGAUAAACGUUUCGCUAGCC CUGACAGGCAAUCGCGAACGGGCGGCUGAAGGUGACAGGCAAUCGCGAACGGGCGGCUGAAGGCC CCGACCUGUACGGCCUGAAGGAUGAGAAGGCAGACCUGUACGGCCUGAAGGAUGAGAAGGCACAU CAUAUAAGUGGAAAAUUGCUUCCCGUUGUGUUAUAAGUGGAAAAUUGCUUCCCGUUGUGUUCGCU CGCUCACCAGGUACUCCUUAAUUUGAAAGCUGCACCAGGUACUCCUUAAUUUGAAAGCUGCAAGA CAAGAGCUCCUAAUUUGAGGGGUGCAUACAGGGCUCCUAAUUUGAGGGGUGCAUACAGG (SEQ 
IDAGAAAGAGAGAGUAGCGCGA (SEQ ID NO: 768) NO: 769) PL8287 239 0.995AGUAUGAGGCCGCCGAUAAACGUUUCGCUAGC AGUAUGAGGCCGCCGAUAAACGUUUCGCUAGCCCUGACAGGCAAUCGCGAACGGGCGGCUGAAGG UGACAGGCAAUCGCGAACGGGCGGCUGAAGGCCCCGACCUGUACGGCCUGAAGGAUGAGAAGGCA GACCUGUACGGCCUGAAGGAUGAGAAGGCACAUCAUAUAAGUGGAAAAUUGCUUCCCGUUGUGUU AUAAGUGGAAAAUUGCUUCCCGUUGUGUUCGCUCGCUCACCAGGUACUCCUUAAUUUGAAAGCUG CACCAGGUACUCCUUAAUUUGAAAGCUGCAAGACAAGAGCUCCUAAUUUGAGGGGUGCAUACAGG GCUCCUAAUUUGAGGGGUGCAUACAGG (SEQ IDUACUAUGGGAUCAAGCCGCU (SEQ ID NO: 770) NO: 769) PL8288 239 1.598AGUAUGAGGCCGCCGAUAAACGUUUCGCUAGC AGUAUGAGGCCGCCGAUAAACGUUUCGCUAGCCCUGACAGGCAAUCGCGAACGGGCGGCUGAAGG UGACAGGCAAUCGCGAACGGGCGGCUGAAGGCCCCGACCUGUACGGCCUGAAGGAUGAGAAGGCA GACCUGUACGGCCUGAAGGAUGAGAAGGCACAUCAUAUAAGUGGAAAAUUGCUUCCCGUUGUGUU AUAAGUGGAAAAUUGCUUCCCGUUGUGUUCGCUCGCUCACCAGGUACUCCUUAAUUUGAAAGCUG CACCAGGUACUCCUUAAUUUGAAAGCUGCAAGACAAGAGCUCCUAAUUUGAGGGGUGCAUACAGG GCUCCUAAUUUGAGGGGUGCAUACAGG (SEQ IDACCUCAAAUUCCUCCUCAGA (SEQ ID NO: 771) NO: 769) PL8369 232 5.619AACUGCCGGUAAGAUUACGAUAGCCGAAAGGC AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGA AUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAACACGGCCUGAAGGUUGAGUUUAAAGUCACAUAU GGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAGAAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCG CGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCACUCACCAAUACGCGCAAAUUUGAAAAUGUAGU CCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAGUCGAGGUCCAGGCCUAAGGAAGGAGU (SEQ ID G (SEQ ID NO: 773) NO: 772) PL8375232 5.505 AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAACUGCCGGUAAGAUUACGAUAGCCGAAAGGCA AAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAAC ACGGCCUGAAGGUUGAGUUUAAAGUCACAUAUGGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAG AAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCGCGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCA CUCACCAAUACGCGCAAAUUUGAAAAUGUAGUCCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAG UCGAGGUUGGUGAAGUAGGGCCUCCU (SEQ IDG (SEQ ID NO: 773) NO: 774) PL8378 232 0.994AACUGCCGGUAAGAUUACGAUAGCCGAAAGGC AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGA AUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAACACGGCCUGAAGGUUGAGUUUAAAGUCACAUAU 
GGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAGAAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCG CGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCACUCACCAAUACGCGCAAAUUUGAAAAUGUAGU UCGAGGAAUUCCGGGUAUCCCAGGAG (SEQ IDCCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAG NO: 775) G (SEQ ID NO: 773) PL8379 2320.767 AACUGCCGGUAAGAUUACGAUAGCCGAAAGGC AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGA AUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAACACGGCCUGAAGGUUGAGUUUAAAGUCACAUAU GGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAGAAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCG CGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCACUCACCAAUACGCGCAAAUUUGAAAAUGUAGU CCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAGUCGAGGUUCAUUGCAGAAAGAGACAU (SEQ ID G (SEQ ID NO: 773) NO: 776) PL8383232 0.505 AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAACUGCCGGUAAGAUUACGAUAGCCGAAAGGCA AAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAAC ACGGCCUGAAGGUUGAGUUUAAAGUCACAUAUGGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAG AAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCGCGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCA CUCACCAAUACGCGCAAAUUUGAAAAUGUAGUCCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAG UCGAGGAGAUCACGAGGAAUACAACA(SEQ IDG (SEQ ID NO: 773) NO: 777) PL8386 232 3.165AACUGCCGGUAAGAUUACGAUAGCCGAAAGGC AACUGCCGGUAAGAUUACGAUAGCCGAAAGGCAAAUUGCGUAUGCGGCAGUUAAGGCCGGCUCGA AUUGCGUAUGCGGCAGUUAAGGCCGGCUCGAACACGGCCUGAAGGUUGAGUUUAAAGUCACAUAU GGCCUGAAGGUUGAGUUUAAAGUCACAUAUAAGAAGCGGAAAAAUCAGAUUUCCCAUUGUGUUCG CGGAAAAAUCAGAUUUCCCAUUGUGUUCGCUCACUCACCAAUACGCGCAAAUUUGAAAAUGUAGU CCAAUACGCGCAAAUUUGAAAAUGUAGUUCGAGUCGAGGCAGCCGGGAGGAGCAGCAAG (SEQ ID G (SEQ ID NO: 773) NO: 778) PL8427231 0.832 ACCGAGGCCGCGAAAAACACAACGCUAGCCGAACCGAGGCCGCGAAAAACACAACGCUAGCCGAAA AAGGCAAUCGCGGGUGCGCGGCCGAAGGCCGAGGCAAUCGCGGGUGCGCGGCCGAAGGCCGACUA CUAGAGCGGCCUGAAGGUUGAGAAGCGUGCAUGAGCGGCCUGAAGGUUGAGAAGCGUGCAUGUAA GUAAACGGCAGAAAAAAUGCCUUUUGUACGCGACGGCAGAAAAAAUGCCUUUUGUACGCGCUCAC CUCACCGAACACGUCUGAGCGGUUUGAAAGGUCGAACACGUCUGAGCGGUUUGAAAGGUGUGCUC GUGCUCUAGGACUAUGGGAUCAAGCCGCUGUAGG (SEQ ID NO: 780) (SEQ ID NO: 779) PL5995 228 23.175GGGGUUGUUGGAAACCCUUAUGCUGAGGGAUU GGGGUUGUUGGAAACCCUUAUGCUGAGGGAUUAAUUCCACUCGGUAAGUACCUUAAAUAGUUAUA 
UUCCACUCGGUAAGUACCUUAAAUAGUUAUAGAGAAAGAUGUAAAUCAUCUAUAAAAGAAAGGUA AAGAUGUAAAUCAUCUAUAAAAGAAAGGUACAUCAUCCAACGCCUGGAGGCUAUCCAGCGU (SEQ CCAAC (SEQ ID NO: 782) ID NO: 781)PL6002 228 0.564 GGGGUUGUUGGAAACCCUUAUGCUGAGGGAUUGGGGUUGUUGGAAACCCUUAUGCUGAGGGAUUA AUUCCACUCGGUAAGUACCUUAAAUAGUUAUAUUCCACUCGGUAAGUACCUUAAAUAGUUAUAGA GAAAGAUGUAAAUCAUCUAUAAAAGAAAGGUAAAGAUGUAAAUCAUCUAUAAAAGAAAGGUACAU CAUCCAACACUUUCCAUUCUCUGCUGGA(SEQCCAAC (SEQ ID NO: 782) ID NO: 783) PL8069 213 2.442AAGAUAUGAAUAGGAGUAUUCCUAUGGGGCAG AAGAUAUGAAUAGGAGUAUUCCUAUGGGGCAGUUUGGUUGCCCUUAGCCUGAGGUAUUUAAUGCA UGGUUGCCCUUAGCCUGAGGUAUUUAAUGCACUCUCGGGAAGUACUUUCAACAGUAUCCGUUAGA CGGGAAGUACUUUCAACAGUAUCCGUUAGAAAAAAAGGUACAUCCAACGUGUUGCUGGAGGGGGC GGUACAUCCAAC (SEQ ID NO: 785)CUU (SEQ ID NO: 784)

TABLE 35 D2S effectors and targets PAM sequences
Enzyme SEQ ID NO   PAM sequence(s)
220                TCG (SEQ ID NO: 156)
233                TTR (SEQ ID NO: 786); TR (SEQ ID NO: 787)
240                TTR (SEQ ID NO: 786); TTTR (SEQ ID NO: 788)
16                 CC (SEQ ID NO: 155)
14                 CC (SEQ ID NO: 155)
15                 CC (SEQ ID NO: 155)
239                CC (SEQ ID NO: 155)
232                TTTYC (SEQ ID NO: 789)
231                CCN (SEQ ID NO: 790)
228                TG (SEQ ID NO: 791); TNTG (SEQ ID NO: 368)
213                GGTYG (SEQ ID NO: 792)

Example 23: Effector Protein Tags

CasM.19952 (SEQ ID NO: 23) was purified with a TEV-cleavable MBP tag, which has the TEV cleavage site of ENLYFQSNA (SEQ ID NO: 811). Proteins purified with a TEV-cleavable MBP tag may be useful for various applications, including but not limited to modifying a cell ex vivo. TEV cleavage typically happens before the protein is introduced into the cell. After TEV cleavage, the protein's N terminus retains three additional amino acids (SerAsnAla; SNA). This is true regardless of whether NLSs are also present.
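
The effect of TEV cleavage on the N terminus can be illustrated with a short sketch. It assumes cleavage occurs between the Q and S residues of the ENLYFQSNA site, which is consistent with the retained SNA residues described above; the function name tev_cleave and the truncated toy sequence are hypothetical and used only for illustration.

```python
TEV_SITE = "ENLYFQSNA"  # cleavage site given in the text (SEQ ID NO: 811)
CUT_OFFSET = 6          # assumed cut after ENLYFQ, leaving SNA on the product


def tev_cleave(protein: str) -> tuple[str, str]:
    """Return (released tag, cleaved protein) for the first TEV site found."""
    start = protein.find(TEV_SITE)
    if start == -1:
        raise ValueError("no TEV cleavage site found")
    cut = start + CUT_OFFSET
    return protein[:cut], protein[cut:]


# Toy input: a few tag residues, the TEV site, and the first effector residues.
tag, product = tev_cleave("MKSSGIEENLYFQSNAMPTITRK")
print(product)  # begins with SNA, followed by the effector protein
```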

Similarly, effector proteins with different tags, including T2A, His, FLAG and GFP, were developed for various purposes. Exemplary sequences are described in Tables 36 and 37. In particular, examples of the tagged constructs are shown in Table 36 and individual components of tagged constructs are shown in Table 37. The components of the tagged constructs shown in Table 37 can be applied to any D2S effector protein disclosed herein, for example to SEQ ID NOs: 1-45, 202-293, or 728-731.

TABLE 36 Tagged Construct Examples SEQ ID Description NO:Amino Acid Sequence Full Uncleaved TEV- 812MKSSHHHHHHHHHHGSSMKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHCleavable and MBPPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVtag sequence ofRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFT CasM.19952WPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQTNSSSNNNNNNNNNNLGIEENLYFQSNAMPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHVSSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTF1CENPECKQCGEKVHADYNAARNLAN SKDIIKKNEFull cleaved TEV- 813SNAMPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHVSSMCleavable and MBPVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMtag sequence ofSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFA CasM.19952WDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANS KDIIKKNEFull sequence of His 814MKSSHHHHHHHGSSMPTITRKIELTLLTEGLSEEQRKEQWGLLYHINDNLYKAANNISSand GFP tagged KLYLDDHVSSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDCasM.19952 
QEHAICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNEGSDGGSGGGSTSRDHMVLHEYVNAAGIT Full uncleaved 815MDYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGIHGVPAAMPTITRKIELTLLTEGLsequence of T2ASEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHVSSMVRMKHAEYLSLLKELARAtagged CasM.19952EKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNEKRPAATKKAGQAKKKKEFGSGEGRGSLLTCGDVEENPGP Cleaved sequence of 816MDYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGIHGVPAAMPTITRKIELTLLTEGLT2A tagged CasM.19952SEEQRKEQWGLLYHINDNLYKAANNISSKLYLDDHVSSMVRMKHAEYLSLLKELARAEKQKTPDADAIAELRKKVAAAEKEMTDQEHAICKYATEMSTQSLSYRFATELETNIFAKILDCLKQGVFATFNSDARDVKRGERAIRNYKKGMPIPFAWDKSLRIEKDNKDFYLRWYNGLRFLFNFGKDRSNNRLIVERCLKMDADYDGEYKLCNSSIQIAKREGKTKLFLLLVVKIPQEHVELNKKVVVGVDLGINVPAYVATNITEERKAIGDREHFLNSRMAFQRRYKSLQRLRGTAGGKGRAKKLEPLERLRKAEHNWVHTQNHLFSREVVDFAVKSHAATIHMEDLSGFGKDNDGNADERKEFVLRNWSYYELQNMIAYKAAKYGIKVEKIHPAYTSKTCSWCGQLGFREGVTFICENPECKQCGEKVHADYNAARNIANSKDIIKKNEKRPAATKKAGQAKKKKEFGSGEGRGSLLTCGDVEENPG

TABLE 37 Components of Tagged Constructs SEQ ID Description NO:Amino Acid Sequence N-terminus sequence 817MKSSHHHHHHHHHHGSSMKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHof TEV-cleavablePDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTMBP tag before WPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAECasM.19952 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQTNSSSNNNNNNNNNNLGIEENLY FQSNA10X His tag 818 HHHHHHHHHH MBP tag 819MKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQT N-terminus His6 tag 820 MKSSHHHHHHHGSSplus linker before CasM.19952 C-terminus Linker- 821GSDGGSGGGSTSRDHMVLHEYVNAAGIT GFP11 tag after CasM.19952N terminus of T2A 822 MDYKDHDGDYKDHDIDYKDDDDKMAPKKKRKVGIHGVPAAtagged effector protein 3x FLAG tag of N 823 MDYKDHDGDYKDHDIDYKDDDDKterminus of T2A tagged effector protein SV40 NLS sequence 712 PKKKRKVofN terminus of T2A tagged effector protein C terminus of T2A 824KRPAATKKAGQAKKKKEFGSGEGRGSLLTCGDVEENPGP tagged effector proteinNLS (nucleoplasmin) 825 KRPAATKKAGQAKKKK of C terminus of T2Atagged effector protein T2A self-cleaving 826 GSGEGRGSLLTCGDVEENPGPpeptide sequence

Example 24: CasM.19952 Demonstrates Blunt Cutting of dsDNA

A CasM.19952 (SEQ ID NO: 23) sgRNA complex (200 nM) was incubated with a target nucleic acid having a PAM of GTCG (10 nM) at 37 degrees Celsius for 1 hour in CutSmart buffer. Purified and amplified fragments were subjected to Sanger sequencing using multiple forward and reverse primers to read both the target and non-target strands. FIG. 12 shows the sequencing reads, which were interpreted as blunt cutting.

Example 25: D2S Sequence Similarity

The following method was used to calculate the similarity of D2S enzymes disclosed herein to CasM.19952, as well as the similarity of sequences within each D2S enzyme sequence to the multilevel consensus sequence/PROSITE motifs described in Example 21.

The BLOSUM62 similarity matrix (Henikoff & Henikoff, 1992) was transformed so that any value ≥1 was replaced with +1 and any value ≤0 was replaced with 0. For example, the Ile to Leu substitution is scored at +2.0; in the transformed matrix, it is scored at +1. This transformation allows the calculation of percent similarity, rather than a similarity score.

For similarity over the MEME motifs, the multilevel consensus sequence (or PROSITE motif sequence) was used to identify how strongly each motif was conserved. In calculating the similarity of a motif sequence, the second and third levels of the multilevel sequence were treated as equivalent to the top level. Alternatively, when comparing two full protein sequences, the proteins were aligned using pairwise MUSCLE alignment. Then, the similarity was scored at each residue and divided by the length of the alignment.

If a substitution could be treated as conservative with any of the amino acids in that position of the multilevel consensus sequence, +1 point was assigned. For example, given the multilevel consensus sequence:

RLG

YCK

. . . the test sequence QIQ would receive three points. This is because, in the transformed BLOSUM62 matrix, each combination is scored as: Q-R: +1; Q-Y: +0; I-L: +1; I-C: +0; Q-G: +0; Q-K: +1. For each position, the highest score is used when calculating similarity.

The score over the length of the motif was divided by the length of the motif to provide the % similarity. In the example above, the % similarity would be 100%. This process is equivalent to the percent similarity calculation used by the Geneious Prime software given the parameters matrix=BLOSUM62 and threshold ≥1.
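
The scoring of the QIQ example can be reproduced with a minimal sketch of the transformed-matrix calculation. Only the six BLOSUM62 entries needed for this example are included, so the dictionary is an excerpt rather than the full matrix; the function names are hypothetical, and the sketch is not the Geneious Prime implementation.

```python
# Excerpt of raw BLOSUM62 scores covering only the pairs in the example.
BLOSUM62_EXCERPT = {
    ("Q", "R"): 1, ("Q", "Y"): -1,
    ("I", "L"): 2, ("I", "C"): -1,
    ("Q", "G"): -2, ("Q", "K"): 1,
}


def transformed_score(a: str, b: str, matrix: dict) -> int:
    """Apply the transformation: raw values >= 1 become 1, values <= 0 become 0."""
    raw = matrix.get((a, b), matrix.get((b, a)))
    if raw is None:
        raise KeyError(f"pair {a}-{b} is not in the matrix excerpt")
    return 1 if raw >= 1 else 0


def motif_similarity(test_seq: str, consensus_levels: list[str], matrix: dict) -> float:
    """Percent similarity: at each position take the best score over all
    consensus levels, sum the points, and divide by the motif length."""
    if any(len(level) != len(test_seq) for level in consensus_levels):
        raise ValueError("all sequences must have the same length")
    points = sum(
        max(transformed_score(res, level[i], matrix) for level in consensus_levels)
        for i, res in enumerate(test_seq)
    )
    return 100.0 * points / len(test_seq)


# Consensus levels RLG and YCK from the example, test sequence QIQ.
print(motif_similarity("QIQ", ["RLG", "YCK"], BLOSUM62_EXCERPT))
```

Taking the maximum over levels is what treats the second and third levels of the multilevel sequence as equivalent to the top level.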

As shown in Table 41, there are 24 D2S enzymes with greater than 70% similarity to CasM.19952. Including CasM.19952, there are 26 sequences that have greater than 80% similarity to six or more of the MEME motifs, as shown in Table 42. Of these, 19 (excluding CasM.19952 itself) have greater than 80% similarity to the MEME motifs of CasM.19952. These are the same 19 sequences with at least 75% identity to CasM.19952 overall.

TABLE 41 D2S Effector Protein Sequence Similarity
Effector Protein Name   % similarity to CasM.19952
CasM.19952              100.0
CasM.288480             97.9
CasM.274559             96.8
CasM.272451             95.9
CasM.289206             92.3
CasM.289248             92.3
CasM.290598             92.9
CasM.287826             92.1
CasM.294406             89.8
CasM.286251             90.2
CasM.290816             90.4
CasM.287936             90.0
CasM.270012             91.2
CasM.295231             90.9
CasM.19498              87.4
CasM.288450             90.8
CasM.19948              88.5
CasM.279423             89.5
CasM.295071             86.4
CasM.288668             86.1
CasM.285333             78.0
CasM.290380             76.3
CasM.287128             76.0
CasM.286678             75.7
CasM.19924              71.6
CasM.292139             65.1
CasM.265291             58.9
CasM.296640             60.4
CasM.288712             59.0
CasM.294190             57.5
CasM.299584             57.0
CasM.298446             50.6

TABLE 42
D2S MEME motif percent similarity

Effector Protein MEME_1 MEME_2 MEME_3 MEME_4 MEME_5 MEME_6 MEME_7
CasM.19952 96.0 96.6 100.0 100.0 95.2 80.0 87.0
CasM.288480 96.0 96.6 100.0 100.0 95.2 80.0 87.0
CasM.274559 94.0 93.1 95.2 100.0 95.2 80.0 87.0
CasM.272451 94.0 93.1 100.0 100.0 100.0 80.0 87.0
CasM.289206 92.0 93.1 100.0 100.0 100.0 100.0 91.3
CasM.289248 94.0 93.1 100.0 100.0 100.0 100.0 91.3
CasM.290598 96.0 93.1 100.0 100.0 100.0 76.7 82.6
CasM.287826 96.0 93.1 100.0 100.0 100.0 100.0 87.0
CasM.294406 94.0 96.6 95.2 97.6 95.2 93.3 87.0
CasM.286251 94.0 93.1 95.2 97.6 100.0 100.0 87.0
CasM.290816 96.0 89.7 100.0 95.1 95.2 100.0 91.3
CasM.287936 94.0 93.1 95.2 97.6 100.0 100.0 87.0
CasM.270012 92.0 93.1 100.0 100.0 100.0 100.0 87.0
CasM.295231 96.0 89.7 100.0 95.1 95.2 100.0 87.0
CasM.19498 92.0 93.1 100.0 100.0 100.0 100.0 91.3
CasM.288450 92.0 89.7 100.0 95.1 95.2 86.7 91.3
CasM.19948 94.0 93.1 100.0 95.1 100.0 100.0 87.0
CasM.279423 96.0 93.1 100.0 97.6 100.0 100.0 87.0
CasM.295071 94.0 93.1 95.2 97.6 100.0 100.0 87.0
CasM.288668 94.0 93.1 95.2 97.6 100.0 93.3 91.3
CasM.285333 64.0 86.2 95.2 92.7 95.2 93.3 91.3
CasM.290380 80.0 89.7 100.0 92.7 90.5 86.7 95.7
CasM.287128 80.0 86.2 100.0 95.1 95.2 93.3 95.7
CasM.286678 82.0 89.7 100.0 92.7 95.2 86.7 95.7
CasM.19924 100.0 86.2 90.5 68.3 90.5 86.7 91.3
CasM.292139 86.0 82.8 90.5 61.0 90.5 93.3 95.7
CasM.265291 74.0 72.4 76.2 68.3 85.7 80.0 69.6
CasM.296640 76.0 75.9 85.7 53.7 85.7 80.0 73.9
CasM.294190 74.0 69.0 76.2 70.7 71.4 80.0 69.6
CasM.288712 78.0 75.9 85.7 52.4 81.0 80.0 73.9
CasM.299584 84.0 69.0 85.7 58.5 95.2 80.0 69.6
CasM.298446 84.0 86.2 76.2 71.4 93.3 73.9
CasM.289802 58.6 61.9 43.9 66.7 60.9
CasM.286285 58.6 53.7 81.0 69.6
CasM.20054 88.0 82.8 90.5 76.2 93.3 78.3
CasM.284933 80.0 89.7 85.7 76.2 93.3 73.9
CasM.289726 58.6 57.1 46.3 61.9 65.2
CasM.294537 88.0 79.3 95.2 81.0 80.0 69.6
CasM.295929 86.0 82.8 90.5 76.2 93.3 78.3
CasM.298538 82.0 75.9 95.2 41.5 81.0 80.0 73.9
CasM.286588 82.0 79.3 76.2 36.6 76.2 86.7 65.2
CasM.19910 90.0 75.9 95.2 81.0 86.7 69.6
CasM.291449 84.0 75.9 90.5 81.0 86.7 73.9
CasM.293576 86.0 75.9 95.2 85.7 80.0 65.2
CasM.287896 84.0 82.8 90.5 81.0 70.0 69.6
CasM.293410 90.0 82.8 90.5 31.7 81.0 93.3 78.3
CasM.295187 90.0 82.8 90.5 31.7 81.0 93.3 78.3
CasM.297599 86.0 79.3 95.2 85.7 86.7 63.0
CasM.286910 88.0 79.3 95.2 81.0 80.0 69.6
CasM.296642 92.0 79.3 95.2 71.4 93.3 73.9
CasM.298612 82.0 79.3 95.2 81.0 80.0 65.2
CasM.274429 90.0 75.9 90.5 41.5 81.0 93.3 69.6
CasM.282673 88.0 82.8 90.5 85.7 86.7 78.3
CasM.294601 72.0 72.4 76.2 60.9
CasM.294270 86.0 93.1 66.7 76.2 100.0 65.2
CasM.295105 90.0 89.7 95.2 81.0 93.3 73.9
CasM.19548 80.0 75.9 95.2 39.0 81.0 86.7 69.6
CasM.287908 86.0 96.6 85.7 81.0 93.3 78.3
CasM.291507 86.0 86.2 95.2 85.7 93.3 73.9
CasM.283262 90.0 89.7 95.2 81.0 93.3 73.9
CasM.295201 88.0 79.3 95.2 85.7 93.3 65.2
CasM.284833 86.0 86.2 90.5 81.0 93.3 73.9
CasM.294655 88.0 89.7 90.5 85.7 86.7 69.6
CasM.277328 82.0 93.1 90.5 85.7 93.3 73.9
CasM.292335 84.0 82.8 85.7 41.5 85.7 80.0 69.6
CasM.294491 86.0 93.1 85.7 81.0 86.7 78.3
CasM.293203 88.0 75.9 90.5 90.5 86.7 73.9
CasM.287700 88.0 89.7 95.2 81.0 100.0 73.9
CasM.280852 66.0 72.4 76.2 93.3 60.9
CasM.293891 80.0 96.6 85.7 81.0 93.3 69.6
CasM.281060 84.0 93.1 66.7 76.2 100.0 73.9
CasM.299588 86.0 82.8 90.5 76.2 100.0 69.6
CasM.288518 82.0 93.1 90.5 76.2 70.0 73.9
CasM.280604 84.0 89.7 90.5 85.7 78.3
CasM.298706 88.0 89.7 76.2 71.4 93.3 73.9
CasM.281050 88.0 75.9 95.2 76.2 93.3 69.6
CasM.277378 86.0 86.2 90.5 81.0 93.3 73.9
CasM.297894 88.0 89.7 76.2 71.4 93.3 73.9
CasM.295047 80.0 89.7 85.7 76.2 93.3 73.9
CasM.282952 88.0 89.7 85.7 85.7 80.0 78.3
CasM.298142 66.0 69.0 81.0 60.9
CasM.292901 72.0 72.4 59.5 60.9
CasM.298264 52.4 38.1
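By way of non-limiting illustration, a per-motif percentage of the kind tabulated above may be obtained by comparing a motif sequence against ungapped windows of an effector protein sequence. The following is a minimal sketch only. It computes plain percent identity (the measure recited in the claims), whereas Table 42 reports percent similarity and this section does not state how that value was scored. The function names and the short sequences used with them are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(motif: str, window: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(motif) != len(window):
        raise ValueError("sequences must have equal length")
    if not motif:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(motif, window))
    return round(100.0 * matches / len(motif), 1)


def best_window_identity(motif: str, protein: str) -> float:
    """Highest percent identity of the motif over all ungapped windows.

    Assumption: no gaps are allowed; a real comparison may use a gapped
    alignment and a substitution matrix instead.
    """
    n = len(motif)
    if len(protein) < n:
        raise ValueError("protein is shorter than the motif")
    return max(
        percent_identity(motif, protein[i:i + n])
        for i in range(len(protein) - n + 1)
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, for illustration only.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(best_window_identity("ACDE", "MMACDFKK"))
```

In this sketch a single mismatch in a ten-residue motif lowers the value by ten percentage points, so the granularity of the reported percentage depends on the motif length.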

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

1.-54. (canceled)
 55. A system comprising: a) an effector protein, or a nucleic acid encoding an effector protein, wherein the effector protein comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 796 (MEME_4); and b) an engineered guide nucleic acid or a DNA molecule encoding the engineered guide nucleic acid.
 56. The system of claim 55, wherein the length of the effector protein is 400 amino acids to 700 amino acids.
 57. The system of claim 55, wherein the effector protein comprises an additional amino acid sequence that is at least 90% identical to at least one of SEQ ID NO: 793 (MEME_1), SEQ ID NO: 795 (MEME_3), and SEQ ID NO: 798 (MEME_6).
 58. The system of claim 55, wherein the effector protein comprises an additional amino acid sequence that is at least 90% identical to at least one of SEQ ID NO: 793 (MEME_1), SEQ ID NO: 794 (MEME_2), SEQ ID NO: 795 (MEME_3), SEQ ID NO: 797 (MEME_5), SEQ ID NO: 798 (MEME_6), and SEQ ID NO: 799 (MEME_7).
 59. The system of claim 55, wherein the effector protein comprises an amino acid sequence that is at least 70% identical to each of SEQ ID NO: 793 (MEME_1), SEQ ID NO: 794 (MEME_2), SEQ ID NO: 795 (MEME_3), SEQ ID NO: 797 (MEME_5), SEQ ID NO: 798 (MEME_6), and SEQ ID NO: 799 (MEME_7).
 60. The system of claim 55, wherein the amino acid sequence of the effector protein is at least 70% identical to any one of SEQ ID NOs: 23-34.
 61. The system of claim 55, wherein the amino acid sequence of the effector protein is at least 80% identical to any one of SEQ ID NOs: 23-34.
 62. The system of claim 55, wherein the engineered guide nucleic acid comprises a sequence that is at least 95% complementary to a eukaryotic sequence.
 63. The system of claim 55, wherein the engineered guide nucleic acid comprises a sequence that is at least 80% identical to a sequence selected from SEQ ID NOs: 624, 628, 630, 634, 638, 641, 643, 645, 646, and 827-929.
 64. The system of claim 55, wherein the engineered guide nucleic acid is a single guide RNA (sgRNA).
 65. The system of claim 55, wherein the engineered guide nucleic acid comprises one or more modifications selected from a 2′-OMe modification, a 2′-Fluoro modification, and a phosphorothioate linkage.
 66. The system of claim 55, wherein the effector protein comprises a RuvC domain.
 67. The system of claim 55, comprising an expression vector, wherein the expression vector comprises the nucleic acid encoding the effector protein, and optionally the DNA molecule encoding the engineered guide nucleic acid.
 68. The system of claim 67, wherein the expression vector is a viral vector.
 69. The system of claim 68, wherein the viral vector is an adeno associated viral (AAV) vector.
 70. The system of claim 67, wherein the viral vector is a self-complementary AAV (scAAV) vector.
 71. The system of claim 55, wherein the nucleic acid encoding the effector protein is a messenger RNA (mRNA).
 72. The system of claim 71, comprising a lipid or lipid nanoparticle.
 73. The system of claim 55, comprising a fusion partner protein.
 74. The system of claim 73, wherein the fusion partner protein is fused or linked to the effector protein.
 75. The system of claim 73, wherein the fusion partner protein is selected from: a transcriptional activator, a transcriptional repressor, a deaminase, a reverse transcriptase, a methyltransferase, a demethylase, a histone acetyltransferase, a histone deacetylase, a deaminase, a polymerase, and a reverse transcriptase.
 76. The system of claim 55, wherein the effector protein comprises a nuclear localization signal.
 77. A method of modifying a target nucleic acid, the method comprising contacting the target nucleic acid with the system of claim 55.
 78. The method of claim 76, comprising contacting a cell with the system of claim 55.
 79. The method of claim 77, wherein the cell is a mammalian cell.
 80. The method of claim 76, wherein the target nucleic acid comprises double stranded DNA (dsDNA).
 81. The method of claim 76, comprising modifying a nucleobase of the target nucleic acid to a different nucleobase.
 82. The method of claim 76, comprising modifying a target sequence of the target nucleic acid.
 83. One or more nucleic acid vectors encoding at least one of: a) an effector protein, wherein the effector protein comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 796 (MEME_4); and b) an engineered guide nucleic acid.
 84. A method of detecting a target nucleic acid in a cell, a subject or a sample, the method comprising contacting the cell, subject or sample with: a) an effector protein, or a nucleic acid encoding an effector protein, wherein the effector protein comprises an amino acid sequence that is at least 90% identical to SEQ ID NO: 796 (MEME_4); and b) an engineered guide nucleic acid or a DNA molecule encoding the engineered guide nucleic acid.
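By way of non-limiting illustration only, the numerical thresholds recited in claims 55 and 59 (at least 90% for MEME_4, and at least 70% for each of MEME_1, MEME_2, MEME_3, MEME_5, MEME_6, and MEME_7) may be applied to per-motif percentages as sketched below. This sketch rests on a stated assumption: the Table 42 values, which are percent similarity, are used as stand-ins for the percent identity recited in the claims, and the two measures are not necessarily equal. The function names are hypothetical, and the sketch does not construe the claims.

```python
# Per-motif percentages (MEME_1 .. MEME_7) copied from complete rows of Table 42.
# Assumption: percent similarity is used here in place of percent identity.
TABLE_42_ROWS = {
    "CasM.19952": (96.0, 96.6, 100.0, 100.0, 95.2, 80.0, 87.0),
    "CasM.285333": (64.0, 86.2, 95.2, 92.7, 95.2, 93.3, 91.3),
    "CasM.19924": (100.0, 86.2, 90.5, 68.3, 90.5, 86.7, 91.3),
    "CasM.292139": (86.0, 82.8, 90.5, 61.0, 90.5, 93.3, 95.7),
}

MEME_4_INDEX = 3  # zero-based position of MEME_4 in each tuple


def meets_meme4_threshold(values, minimum=90.0):
    """True if the MEME_4 percentage is at least `minimum` (claim 55 figure)."""
    return values[MEME_4_INDEX] >= minimum


def meets_other_motif_thresholds(values, minimum=70.0):
    """True if every motif other than MEME_4 is at least `minimum` (claim 59 figure)."""
    others = [v for i, v in enumerate(values) if i != MEME_4_INDEX]
    return all(v >= minimum for v in others)


if __name__ == "__main__":
    for name, values in TABLE_42_ROWS.items():
        print(name, meets_meme4_threshold(values), meets_other_motif_thresholds(values))
```

Under these assumptions, CasM.285333 satisfies the 90% figure for MEME_4 but not the 70% figure for MEME_1, which shows that the two thresholds are evaluated independently.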